Add more details to node upgrade procedure (#1586)
I got some questions around node upgrades on the community Slack and figured that we should capture the answers in the migration guide. I've added some more context to the node upgrade procedure, going into more detail on the behavior of Managed Node Groups.

Once this is merged, I'll also update the copy of this in the registry.
`docs/eks-v3-migration.md` (40 additions & 16 deletions)
### Gracefully upgrading node groups

#### Managed Node Groups (`ManagedNodeGroup`)

The `ManagedNodeGroup` component has different update behaviors depending on the type of change.

For regular updates (e.g., scaling, labels), EKS will:

* boot the updated replacement nodes
* cordon the old nodes to prevent new pods from being scheduled onto them
* drain all nodes in the node group simultaneously
* shut down the empty old nodes

However, for certain changes, such as updating the AMI type (e.g., migrating from AL2 to AL2023), in-place updates are not supported and a full replacement is required:

* a new node group is created first
* the old node group is deleted once the new one is ready
* EKS drains all pods from the old node group simultaneously during deletion

Note: The detailed update procedure is described in the [AWS docs](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-update-behavior.html). If draining all nodes simultaneously is not acceptable for your workload, follow the graceful migration approach described [below](#graceful-upgrade).
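If you want to watch EKS roll out a managed node group update, the AWS CLI can surface the update status. A minimal sketch, assuming a cluster named `my-cluster` and a node group named `my-nodegroup` (both placeholders):

```bash
# List updates EKS has performed (or is performing) on the node group
aws eks list-updates --name my-cluster --nodegroup-name my-nodegroup

# Inspect a specific update, including its status and any errors
aws eks describe-update --name my-cluster --nodegroup-name my-nodegroup --update-id <update-id>
```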
#### Self-Managed Node Groups (`NodeGroup` and `NodeGroupV2`)

For self-managed node groups (i.e., the `NodeGroup` and `NodeGroupV2` components), Pulumi updates the node group in place: it first creates the new replacement nodes and then shuts down the old ones, which forcibly moves pods to the new nodes. This is the default behavior when node groups are updated.

Note: If you want to migrate to a new node group more gracefully, follow the steps below.

#### Graceful Upgrade

You can gracefully update your node groups by creating a new node group side-by-side with the existing one and then draining the old node group gradually. This involves the following steps:

1. Create the replacement node group side-by-side with the existing node group. For self-managed node groups you need to make sure that the two node groups are allowed to communicate with each other. You can achieve this in the following way:
```ts
const oldNG = new eks.NodeGroupV2("old", {
    // ... (unchanged lines omitted from this diff)

const newToOld = new aws.vpc.SecurityGroupIngressRule("newToOld", {
    // ... (unchanged lines omitted from this diff)
});
```
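Before draining anything, it can help to confirm that the replacement nodes have joined the cluster and are `Ready`. This is a generic sanity check, not part of the original guide:

```bash
# List all nodes and wait until every node reports Ready
kubectl get nodes -o wide
kubectl wait --for=condition=Ready nodes --all --timeout=10m
```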
2. Find the nodes of the old node group.

**For Managed Node Groups:**

Take a note of the node group name and then run the following kubectl command, replacing `$NODE_GROUP_NAME` with the actual name of the node group:

```bash
kubectl get nodes -l eks.amazonaws.com/nodegroup=$NODE_GROUP_NAME -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
```

**For Self-Managed Node Groups:**

Take a note of the name of the auto scaling group associated with that node group and then run the following AWS CLI command, replacing `$ASG_GROUP_NAME` with the actual name of the auto scaling group:
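The command itself is cut off in this excerpt; as a sketch, one way to list the instances of the auto scaling group with the AWS CLI looks like the following (the query path is an assumption, not necessarily the exact command used in the guide):

```bash
# List the EC2 instance IDs currently registered in the old auto scaling group
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names "$ASG_GROUP_NAME" \
  --query "AutoScalingGroups[0].Instances[].InstanceId" \
  --output text
```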
3. Drain each of the nodes of the old node group one by one. This will mark the nodes as unschedulable and gracefully move pods to other nodes. For more information have a look at this article in the [kubernetes documentation](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/).
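For example, draining a single node could look like this (`<node-name>` is a placeholder for one of the node names found in the previous step):

```bash
# Cordon the node and evict its pods; repeat for each node of the old group
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
```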