4 changes: 3 additions & 1 deletion go.mod
@@ -123,7 +123,7 @@ require (
 	golang.org/x/oauth2 v0.34.0 // indirect
 	golang.org/x/sys v0.40.0 // indirect
 	golang.org/x/term v0.39.0 // indirect
-	golang.org/x/text v0.33.0 // indirect
+	golang.org/x/text v0.34.0 // indirect
 	golang.org/x/time v0.14.0 // indirect
 	golang.org/x/tools v0.41.0 // indirect
 	gomodules.xyz/jsonpatch/v2 v2.5.0 // indirect
@@ -140,3 +140,5 @@ require (
 	sigs.k8s.io/randfill v1.0.0 // indirect
 	sigs.k8s.io/structured-merge-diff/v6 v6.3.1 // indirect
 )
+
+replace sigs.k8s.io/karpenter => github.com/AndrewMitchell25/karpenter v0.0.0-20260304191417-cd640b0054df
8 changes: 4 additions & 4 deletions go.sum
@@ -1,3 +1,5 @@
+github.com/AndrewMitchell25/karpenter v0.0.0-20260304191417-cd640b0054df h1:k96lLlCzT/ZoVsT8OWtmgzI8eErOFy6KdkhVKLwZJzA=
+github.com/AndrewMitchell25/karpenter v0.0.0-20260304191417-cd640b0054df/go.mod h1:7HVTLcR8uNwHcnwjfaCqV2ICF3aOPvngK/J8CBXZraU=
 github.com/Masterminds/semver/v3 v3.4.0 h1:Zog+i5UMtVoCU8oKka5P7i9q9HgrJeGzI9SA1Xbatp0=
 github.com/Masterminds/semver/v3 v3.4.0/go.mod h1:4V+yj/TJE1HU9XfppCwVMZq3I84lprf4nC11bSS5beM=
 github.com/Pallinder/go-randomdata v1.2.0 h1:DZ41wBchNRb/0GfsePLiSwb0PHZmT67XY00lCDlaYPg=
@@ -323,8 +325,8 @@ golang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
 golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
 golang.org/x/text v0.15.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
 golang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ=
-golang.org/x/text v0.33.0 h1:B3njUFyqtHDUI5jMn1YIr5B0IE2U0qck04r6d4KPAxE=
-golang.org/x/text v0.33.0/go.mod h1:LuMebE6+rBincTi9+xWTY8TztLzKHc/9C1uBCG27+q8=
+golang.org/x/text v0.34.0 h1:oL/Qq0Kdaqxa1KbNeMKwQq0reLCCaFtqu2eNuSeNHbk=
+golang.org/x/text v0.34.0/go.mod h1:homfLqTYRFyVYemLBFl5GgL/DWEiH5wcsQ5gSh1yziA=
 golang.org/x/time v0.14.0 h1:MRx4UaLrDotUKUdCIqzPC48t1Y9hANFKIRpNx+Te8PI=
 golang.org/x/time v0.14.0/go.mod h1:eL/Oa2bBBK0TkX57Fyni+NgnyQQN4LitPmob2Hjnqw4=
 golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
@@ -375,8 +377,6 @@ sigs.k8s.io/controller-runtime v0.22.4 h1:GEjV7KV3TY8e+tJ2LCTxUTanW4z/FmNB7l327U
 sigs.k8s.io/controller-runtime v0.22.4/go.mod h1:+QX1XUpTXN4mLoblf4tqr5CQcyHPAki2HLXqQMY6vh8=
 sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730 h1:IpInykpT6ceI+QxKBbEflcR5EXP7sU1kvOlxwZh5txg=
 sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730/go.mod h1:mdzfpAEoE6DHQEN0uh9ZbOCuHbLK5wOm7dK4ctXE9Tg=
-sigs.k8s.io/karpenter v1.9.1-0.20260220232539-5e12af134257 h1:Z7WZW+Hw8Naj3kOcHIZbHyIKwTDtzQzm0N9tgqdGZbY=
-sigs.k8s.io/karpenter v1.9.1-0.20260220232539-5e12af134257/go.mod h1:5NVeUwDmwHGnGIiqZhYCVfRx1uE5f9zdZsUYI34isIo=
 sigs.k8s.io/randfill v1.0.0 h1:JfjMILfT8A6RbawdsK2JXGBR5AQVfd+9TbzrlneTyrU=
 sigs.k8s.io/randfill v1.0.0/go.mod h1:XeLlZ/jmk4i1HRopwe7/aU3H5n1zNUcX6TM94b3QxOY=
 sigs.k8s.io/structured-merge-diff/v6 v6.3.1 h1:JrhdFMqOd/+3ByqlP2I45kTOZmTRLBUm5pvRjeheg7E=
57 changes: 50 additions & 7 deletions website/content/en/preview/concepts/disruption.md
@@ -360,20 +360,51 @@ In this scenario, Karpenter cannot voluntarily disrupt the node because:

As seen in this example, the more PDBs there are affecting a Node, the more difficult it will be for Karpenter to find an opportunity to perform voluntary disruption actions.

Secondly, you can block Karpenter from voluntarily disrupting and draining pods by adding the `karpenter.sh/do-not-disrupt` annotation to the pod.
This annotation supports two formats:

| Format | Example | Behavior |
|--------|---------|----------|
| **Boolean** | `karpenter.sh/do-not-disrupt: "true"` | Provides permanent protection from disruption |
| **Duration** | `karpenter.sh/do-not-disrupt: "30m"` | Provides time-based protection for the specified duration after the pod starts running |

#### Duration-Based Protection

When using the duration format, the annotation will be "active" and pods will be protected from disruption for the specified time period after they start running (based on `pod.status.startTime`).
Once the duration expires, the annotation becomes inactive and the pod becomes eligible for disruption.
This is useful for workloads that need protection during startup or critical phases but can be safely disrupted later.

The duration value must be a valid Go `time.Duration` string. Supported formats include:

| Duration | Description |
|----------|-------------|
| `"5m"` | 5 minutes |
| `"1h"` | 1 hour |
| `"2h30m"` | 2 hours and 30 minutes |
| `"24h"` | 24 hours |
| `"1h30m45s"` | 1 hour, 30 minutes, and 45 seconds |

{{% alert title="Note" color="primary" %}}
If an invalid duration is specified, the annotation will be ignored and an event will be emitted on the pod indicating that the duration format is invalid.
{{% /alert %}}

#### Behavior and Consequences

You can treat this annotation as a single-pod blocking PDB that is active either permanently (boolean format) or temporarily, until the configured duration elapses (duration format).
This has the following consequences:
- Nodes with active `karpenter.sh/do-not-disrupt` pods will be excluded from [Consolidation]({{<ref "#consolidation" >}}), and conditionally excluded from [Drift]({{<ref "#drift" >}}).
- If the Node's owning NodeClaim has a [`terminationGracePeriod`]({{<ref "#terminationgraceperiod" >}}) configured, it will still be eligible for disruption via drift.
- Like pods with a blocking PDB, pods with an active `karpenter.sh/do-not-disrupt` annotation will **not** be gracefully evicted by the [Termination Controller]({{<ref "#termination-controller">}}).
Karpenter will not be able to complete termination of the node until one of the following conditions is met:
- All pods with the `karpenter.sh/do-not-disrupt` annotation are removed, or their annotation becomes inactive (duration has elapsed).
- All pods with the `karpenter.sh/do-not-disrupt` annotation have entered a [terminal phase](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase) (`Succeeded` or `Failed`).
- The owning NodeClaim's [`terminationGracePeriod`]({{<ref "#terminationgraceperiod" >}}) has elapsed.

#### Examples

This is useful for pods that you want to run from start to finish without disruption, or that need protection during critical startup phases.

**Permanent protection** - useful for interactive games or long-running batch jobs:
```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        karpenter.sh/do-not-disrupt: "true"
```

**Duration-based protection** - useful for workloads with critical startup phases:
```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        # Protect for 30 minutes after the pod starts running
        karpenter.sh/do-not-disrupt: "30m"
```

{{% alert title="Note" color="primary" %}}
The `karpenter.sh/do-not-disrupt` annotation does **not** exclude nodes from the forceful disruption methods: [Expiration]({{<ref "#expiration" >}}), [Interruption]({{<ref "#interruption" >}}), [Node Repair]({{<ref "#node-repair" >}}), and manual deletion (e.g. `kubectl delete node ...`).
While both interruption and node repair have implicit upper-bounds on termination time, expiration and manual termination do not.
10 changes: 8 additions & 2 deletions website/content/en/preview/troubleshooting.md
@@ -482,9 +482,15 @@ Review what [disruptions are](https://kubernetes.io/docs/concepts/workloads/pods

#### `karpenter.sh/do-not-disrupt` Annotation

If a pod exists with an active `karpenter.sh/do-not-disrupt` annotation on a node, and a request is made to delete the node, Karpenter will not drain any pods from that node or otherwise try to delete the node. The annotation is considered "active" when:
- Set to `"true"` (permanent protection)
- Set to a valid duration (e.g., `"30m"`) and the pod has been running for less than that duration

Nodes that have pods with an active `do-not-disrupt` annotation are not considered for consolidation, though their unused capacity is considered for the purposes of running pods from other nodes which can be consolidated.

If you want to terminate a node with a `do-not-disrupt` pod, you can either remove the annotation from the pod or wait for duration-based protection to expire naturally, and the deprovisioning process will continue.
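Removing the annotation by hand looks like `kubectl annotate pod <pod-name> karpenter.sh/do-not-disrupt-` (the trailing `-` tells kubectl to delete the annotation). Doing the same programmatically takes a JSON patch; the sketch below only builds the patch body (the surrounding client-go setup and `Patch(ctx, name, types.JSONPatchType, patch, ...)` call are assumed, not shown), escaping the `/` in the annotation key as required by RFC 6901:

```go
package main

import (
	"fmt"
	"strings"
)

// escapeJSONPointer escapes a key for use in an RFC 6901 JSON Pointer:
// "~" becomes "~0" and "/" becomes "~1" (in that order).
func escapeJSONPointer(key string) string {
	key = strings.ReplaceAll(key, "~", "~0")
	return strings.ReplaceAll(key, "/", "~1")
}

// removeAnnotationPatch builds a JSON Patch body that deletes one
// annotation from an object's metadata.
func removeAnnotationPatch(annotation string) []byte {
	path := "/metadata/annotations/" + escapeJSONPointer(annotation)
	return []byte(fmt.Sprintf(`[{"op": "remove", "path": "%s"}]`, path))
}

func main() {
	fmt.Println(string(removeAnnotationPatch("karpenter.sh/do-not-disrupt")))
	// prints: [{"op": "remove", "path": "/metadata/annotations/karpenter.sh~1do-not-disrupt"}]
}
```

Without the `~1` escaping, the `/` in `karpenter.sh/do-not-disrupt` would be read as a path separator and the patch would target the wrong field.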

For more details on how this annotation works, see [Pod-Level Controls]({{<ref "./concepts/disruption#pod-level-controls" >}}) in the Disruption documentation.

#### Scheduling Constraints (Consolidation Only)
