✨ MachineHealthCheck supports checking Machine conditions#12275
✨ MachineHealthCheck supports checking Machine conditions#12275justinmir wants to merge 4 commits intokubernetes-sigs:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Welcome @justinmir! |
|
Hi @justinmir. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
00e4c09 to
b7b110b
Compare
b7b110b to
56608f2
Compare
|
/ok-to-test |
| // +kubebuilder:validation:MaxItems=100 | ||
| UnhealthyConditions []UnhealthyCondition `json:"unhealthyConditions,omitempty"` | ||
|
|
||
| // unhealthyMachineConditions contains a list of the machine conditions that determine |
There was a problem hiding this comment.
Currently not enough bandwidth for a full review, but a few quick comments already:
- we should only add this to v1beta2
- we have to add this in a few places. I would recommend to rebase on top of main and search for all occurences of "nodeUnhealthyConditions" and check in which places we also need "unhealthyMachineConditions" (it will be in a lot of these cases :))
MachineHealthCheck currently only allows checking Node conditions to validate if a machine is healthy. However, machine conditions capture conditions that do not exist on nodes, for example, control plane node conditions such as EtcdPodHealthy, SchedulerPodHealthy that can indicate if a controlplane machine has been created correctly. Adding support for Machine conditions enables us to perform remediation during control plane upgrades. This PR introduces a new fieldas part of the MachineHealthCheckSpec: - `UnhealthyMachineConditions` This will mirror the behavior of `UnhealthyNodeConditions` but the MachineHealthCheck controller will instead check the machine conditions.
023c191 to
097384d
Compare
|
/area machinehealthcheck |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Sorry have to focus on finishing up v1beta2 in time for the upcoming release for a while. I'll try to get back to this afterwards |
|
I will try to take a look in the next few days |
|
@justinmir I now have finally bandwith to review this PR :). Could you plesae do another rebase before? Thank you! |
|
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
|
@justinmir hey, do you need any help with taking this PR forward and ready for review? It would be great to rebase and get early reviews from maintainers if you have time. |
MachineHealthCheck currently only allows checking Node conditions to validate if a machine is healthy. However, machine conditions capture conditions that do not exist on nodes, for example, control plane node conditions such as EtcdPodHealthy, SchedulerPodHealthy that can indicate if a controlplane machine has been created correctly. Adding support for Machine conditions enables us to perform remediation during control plane upgrades. This PR introduces a new field as part of the MachineHealthCheckChecks: - `UnhealthyMachineConditions` This will mirror the behavior of `UnhealthyNodeConditions` but the MachineHealthCheck controller will instead check the machine conditions. This reimplements and extends the work originally proposed by @justinmir in PR kubernetes-sigs#12275. Co-authored-by: Justin Miron <justin.miron@reddit.com> Signed-off-by: Furkat Gofurov <furkat.gofurov@suse.com>
|
Thx for working on this so far! /close |
|
@sbueringer: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
What this PR does / why we need it:
MachineHealthCheck currently only allows checking Node conditions to validate if a machine is healthy. However, machine conditions capture conditions that do not exist on nodes, for example, control plane node conditions such as EtcdPodHealthy, SchedulerPodHealthy that can indicate if a controlplane machine has been created correctly.
Adding support for Machine conditions enables us to perform remediation during control plane upgrades.
This PR introduces a new fieldas part of the MachineHealthCheckSpec:
UnhealthyMachineConditionsThis will mirror the behavior of
UnhealthyNodeConditionsbut the MachineHealthCheck controller will instead check the machine conditions.Which issue(s) this PR fixes:
Fixes #5450
Label(s) to be applied
/kind feature
/area machinehealthcheck
Notes for Reviewers
We updated the tests to validate the new
MachineHealthCheckcode paths forUnhealthyMachineConditionsin the following ways:UnhealthyNodeConditionswe also specify aUnhealthyMachineConditions.UnhealthyMachineConditionsfield since this is not specified in old APIs.