New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-3017: Pod Healthy Policy for PDBs #3018
Conversation
approvers: | ||
- "@liggitt" | ||
- "@soltysh" | ||
- "@wojtek-t" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to also open PRR file later to include KEP in the milestone (together with marking it as implementable): https://github.com/kubernetes/enhancements/tree/master/keps/prod-readiness/sig-apps
Please assing to me.
// covered by a PodDisruptionBudget. | ||
type PodHealthyPolicy string | ||
|
||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to consider the other end? That is having MinReadySeconds
and PodAvailable
as PodHealthyPolicy
. The application would then end up with stricter constraints for eviction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow this. How would MinReadySeconds
and PodAvailable
solve the issue about when pods should be considered healthy enough to be covered by PDBs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just a hypothetical scenario, not concerning the original issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM except some wording
Thank you @mortent
c9eb2fb
to
41821aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of nits, but this looks mostly good.
You only need to update the document with the new template and add the missing PRR file.
/hold
to make sure this doesn't land before adding that file
/lgtm
/approve
// whether they are Ready or not. Any pods that are in the Running | ||
// phase will be counted when computing "disruptionsAllowed" and | ||
// will be subject to the PDB for eviction. | ||
PodRunning PodHealthyPolicy = "PodRunning" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the current default, maybe add an info that empty or this means the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the disruption controller and the eviction api don't agree on what is needed for a pod to be healthy, so we don't have a policy here that reflect the current behavior. The idea is that by specifying a policy, users can get consistent behavior between the two.
/assign @wojtek-t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple minor comments, but requires adding PRR file too.
|
||
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? | ||
|
||
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every feature, especially the ones that are extending the API have to be update->downgrade->upgrde tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this to describe a manual test to verify this, mostly based on looking at other keps. If automated tests are feasible, that would obviously be better.
|
||
###### What specific metrics should inform a rollback? | ||
|
||
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can easily imagine a buggy implementatio and e.g. disruption-controller crashing if the field is set or sth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, added more details here.
|
||
N/A | ||
|
||
### Monitoring Requirements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not required for Alpha, but I recommend you to think about it when implementing Alpha. I will be asking for that for sure during Beta review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I will make sure this is addressed for beta.
41821aa
to
335ce7d
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mortent, soltysh, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking Issue description,Provide supporting details for a feature in development labels:kind/feature
pod ?What would you like to be added?
pls description the issues.Feature requests are unlikely to make progress as issues. Please consider engaging with SIGs on slack and mailing lists, instead.A proposal that works through the design along with the implications of the change can be opened as a enhancementskepses-proposals-keps Why is this needed?
Add PodHealthyPolicy to PodDisruptionBudget