Preview deployment ready! Preview URL: https://pr-9010.d18coufmbnnaag.amplifyapp.com Built from commit
1. Stop provisioning capacity in the **impaired** AZ.
2. Stop performing voluntary disruption in the **impaired** AZ.
3. Stop performing voluntary disruption in the **unimpaired** AZs if the disruption relies on scheduling pods to the **impaired** AZ.
Why would you stop disrupting instances in unimpaired AZs, e.g. underutilized or empty ones? If an application relies on infrastructure in the impaired AZ, it won't get scheduled unless its scheduling requirements are flexible. Are you worried about losing capacity in the unimpaired AZs during an outage?
It would be preferable to stop all disruption, but that is not a hard requirement. Making that change requires integration with upstream, which can come later as a supplement. I think there is an upstream issue for stopping disruption: kubernetes-sigs/karpenter#2497
This might make a natural addition to that.
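The gating described in rules 2 and 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: `Candidate`, `ReplacementZones`, and `disruptionAllowed` are hypothetical names, not Karpenter types or APIs, and the set of impaired zones is assumed to be supplied by some external signal.

```go
package main

import "fmt"

// Candidate is a hypothetical stand-in for a node that is being considered
// for voluntary disruption. It is not a Karpenter type.
type Candidate struct {
	// Zone is the AZ the node currently runs in.
	Zone string
	// ReplacementZones lists the AZs in which a scheduling simulation would
	// place the node's pods. It is empty for an empty node.
	ReplacementZones []string
}

// disruptionAllowed applies rules 2 and 3: never disrupt a node in an
// impaired AZ, and never disrupt a node in an unimpaired AZ if doing so
// depends on scheduling pods into an impaired AZ.
func disruptionAllowed(c Candidate, impaired map[string]bool) bool {
	if impaired[c.Zone] {
		return false
	}
	for _, z := range c.ReplacementZones {
		if impaired[z] {
			return false
		}
	}
	return true
}

func main() {
	impaired := map[string]bool{"us-west-2c": true}
	empty := Candidate{Zone: "us-west-2a"}
	needsImpaired := Candidate{Zone: "us-west-2a", ReplacementZones: []string{"us-west-2c"}}
	fmt.Println(disruptionAllowed(empty, impaired))
	fmt.Println(disruptionAllowed(needsImpaired, impaired))
}
```

Under this sketch, an empty or underutilized node in an unimpaired AZ can still be disrupted as long as its pods fit in other unimpaired AZs, which is the case the question above asks about.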
2. Stop performing voluntary disruption in the **impaired** AZ.
3. Stop performing voluntary disruption in the **unimpaired** AZs if the disruption relies on scheduling pods to the **impaired** AZ.
4. Pods with strict scheduling requirements (such as volume requirements or node affinities) that require capacity in the impaired AZ **should not** result in launch attempts.
5. If an option is set, pods with TSCs that require capacity in the impaired AZ should instead have capacity launched into unimpaired AZs while still maintaining skew between the remaining unimpaired AZs.
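One way to read rule 5 is "spread across whatever zones are left". The sketch below is a minimal illustration of that reading, not the proposed implementation: `pickZone` is a hypothetical helper, and the per-zone pod counts and the impaired set are assumed inputs.

```go
package main

import (
	"fmt"
	"sort"
)

// pickZone returns the unimpaired zone that currently holds the fewest pods
// matching the spread selector, so that skew among the remaining zones stays
// as small as possible. It reports false if every zone is impaired.
func pickZone(counts map[string]int, impaired map[string]bool) (string, bool) {
	zones := make([]string, 0, len(counts))
	for z := range counts {
		if !impaired[z] {
			zones = append(zones, z)
		}
	}
	if len(zones) == 0 {
		return "", false
	}
	// Order by pod count, then by name, so the choice is deterministic.
	sort.Slice(zones, func(i, j int) bool {
		if counts[zones[i]] != counts[zones[j]] {
			return counts[zones[i]] < counts[zones[j]]
		}
		return zones[i] < zones[j]
	})
	return zones[0], true
}

func main() {
	counts := map[string]int{"us-west-2a": 3, "us-west-2b": 2, "us-west-2c": 0}
	impaired := map[string]bool{"us-west-2c": true}
	// Place three new pods; the impaired zone is never chosen even though
	// it has the lowest count.
	for i := 0; i < 3; i++ {
		z, ok := pickZone(counts, impaired)
		if !ok {
			break
		}
		counts[z]++
		fmt.Println(z)
	}
}
```

The impaired zone is simply excluded from the candidate set rather than treated as a domain with zero pods, which is what keeps the remaining zones balanced against each other.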
If the cluster topology consists of 3 zones and 1 is impaired, how will pods get scheduled in the unimpaired zones (without changing `whenUnsatisfiable` to `ScheduleAnyway`)?
This is something I am hoping to change upstream about TSCs.
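The concern in the question can be made concrete with the skew arithmetic that Kubernetes documents for topology spread constraints: a pod may be placed in a zone only if that zone's matching-pod count plus one, minus the global minimum across eligible zones, does not exceed `maxSkew`. The sketch below is a simplified model of that check for `DoNotSchedule`. `canSchedule` is a hypothetical helper, and it assumes the impaired zone still counts as an eligible domain.

```go
package main

import "fmt"

// canSchedule models the DoNotSchedule check for a single zonal topology
// spread constraint: placing one more pod in zone is allowed only if
// (count in zone + 1) - (minimum count over all eligible zones) <= maxSkew.
func canSchedule(counts map[string]int, zone string, maxSkew int) bool {
	lowest := -1
	for _, c := range counts {
		if lowest == -1 || c < lowest {
			lowest = c
		}
	}
	return counts[zone]+1-lowest <= maxSkew
}

func main() {
	// Three zones, maxSkew 1. Zone c is impaired and cannot receive pods,
	// but it is still counted as a domain with zero matching pods.
	counts := map[string]int{"us-west-2a": 1, "us-west-2b": 1, "us-west-2c": 0}
	for _, z := range []string{"us-west-2a", "us-west-2b", "us-west-2c"} {
		fmt.Println(z, canSchedule(counts, z, 1))
	}

	// If the impaired zone is dropped from the set of domains, the
	// unimpaired zones become schedulable again.
	delete(counts, "us-west-2c")
	fmt.Println("us-west-2a", canSchedule(counts, "us-west-2a", 1))
}
```

In the first loop only the impaired zone passes the check, so the pod would stay pending. Once that zone is no longer counted as a domain, placement in an unimpaired zone satisfies the constraint without switching to `ScheduleAnyway`.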
Fixes: #7271
Description
How was this change tested?
Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.