# EKS Cluster w/ Elastic Fabric Adapter

This pattern demonstrates an Amazon EKS Cluster with an EFA-enabled nodegroup.

## Deploy

See [here](https://aws-ia.github.io/terraform-aws-eks-blueprints/main/getting-started/#prerequisites) for the prerequisites and steps to deploy this pattern.

## Validate

1. List the nodes by instance type:

   ```sh
   kubectl get nodes -o yaml | grep instance-type | grep node | grep -v f:
   ```

   ```text
   node.kubernetes.io/instance-type: g5.8xlarge
   node.kubernetes.io/instance-type: m5.large
   node.kubernetes.io/instance-type: m5.large
   node.kubernetes.io/instance-type: g5.8xlarge
   ```

   You should see two EFA-enabled (in this example `g5.8xlarge`) nodes in the list.

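   The `grep` filter above can also be exercised offline. As a small sketch (the sample file below is a hypothetical stand-in for real `kubectl get nodes -o yaml` output, reusing values from the listing above):

   ```sh
   # Hypothetical sample of instance-type label lines from `kubectl get nodes -o yaml`;
   # real output also contains managedFields keys prefixed with `f:`.
   printf '%s\n' \
     'node.kubernetes.io/instance-type: g5.8xlarge' \
     'f:node.kubernetes.io/instance-type: {}' \
     'node.kubernetes.io/instance-type: m5.large' > /tmp/nodes.yaml

   # Same filter as the validation step: keep instance-type label lines
   # and drop the `f:` managedFields keys.
   grep instance-type /tmp/nodes.yaml | grep node | grep -v f:
   ```

   The `grep -v f:` step exists only to hide those `managedFields` entries, which repeat every label key.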
2. Deploy the Kubeflow MPI Operator

   The Kubeflow MPI Operator is required to run MPIJobs on EKS, and we will use an MPIJob to test EFA. Deploy the operator:

   ```sh
   kubectl apply -f https://raw.githubusercontent.com/kubeflow/mpi-operator/v0.3.0/deploy/v2beta1/mpi-operator.yaml
   ```

   ```text
   namespace/mpi-operator created
   customresourcedefinition.apiextensions.k8s.io/mpijobs.kubeflow.org created
   serviceaccount/mpi-operator created
   clusterrole.rbac.authorization.k8s.io/kubeflow-mpijobs-admin created
   clusterrole.rbac.authorization.k8s.io/kubeflow-mpijobs-edit created
   clusterrole.rbac.authorization.k8s.io/kubeflow-mpijobs-view created
   clusterrole.rbac.authorization.k8s.io/mpi-operator created
   clusterrolebinding.rbac.authorization.k8s.io/mpi-operator created
   deployment.apps/mpi-operator created
   ```

   In addition to deploying the operator, apply a patch to the `mpi-operator` clusterrole that allows the `mpi-operator` service account access to `leases` resources in the `coordination.k8s.io` apiGroup:

   ```sh
   kubectl apply -f https://raw.githubusercontent.com/aws-samples/aws-do-eks/main/Container-Root/eks/deployment/kubeflow/mpi-operator/clusterrole-mpi-operator.yaml
   ```

   ```text
   clusterrole.rbac.authorization.k8s.io/mpi-operator configured
   ```

3. Run the EFA info test

   The results should show that two EFA adapters are available (one for each worker pod):

   ```sh
   kubectl apply -f https://raw.githubusercontent.com/aws-samples/aws-do-eks/main/Container-Root/eks/deployment/efa-device-plugin/test-efa.yaml
   ```

   ```text
   mpijob.kubeflow.org/efa-info-test created
   ```

   Once the test launcher pod enters status `Running` or `Completed`, see the test logs using the command below:

   ```sh
   kubectl logs -f $(kubectl get pods | grep launcher | cut -d ' ' -f 1)
   ```

   ```text
   Warning: Permanently added 'efa-info-test-worker-1.efa-info-test-worker.default.svc,10.11.13.224' (ECDSA) to the list of known hosts.
   Warning: Permanently added 'efa-info-test-worker-0.efa-info-test-worker.default.svc,10.11.4.63' (ECDSA) to the list of known hosts.
   [1,1]<stdout>:provider: efa
   [1,1]<stdout>:    fabric: efa
   [1,1]<stdout>:    domain: rdmap197s0-rdm
   [1,1]<stdout>:    version: 116.10
   [1,1]<stdout>:    type: FI_EP_RDM
   [1,1]<stdout>:    protocol: FI_PROTO_EFA
   [1,0]<stdout>:provider: efa
   [1,0]<stdout>:    fabric: efa
   [1,0]<stdout>:    domain: rdmap197s0-rdm
   [1,0]<stdout>:    version: 116.10
   [1,0]<stdout>:    type: FI_EP_RDM
   [1,0]<stdout>:    protocol: FI_PROTO_EFA
   ```

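   The `grep | cut` pod lookup used above can also be written as a single `awk` step. A sketch against sample output (the pod names below reuse the EFA info test, and `/tmp/pods.txt` is just an illustrative stand-in for live `kubectl get pods` output):

   ```sh
   # Hypothetical sample of `kubectl get pods` output for the EFA info test.
   printf '%s\n' \
     'NAME                           READY   STATUS      RESTARTS   AGE' \
     'efa-info-test-launcher-hckkj   0/1     Completed   2          37s' \
     'efa-info-test-worker-0         1/1     Running     0          38s' \
     'efa-info-test-worker-1         1/1     Running     0          38s' > /tmp/pods.txt

   # Match the launcher row and print only the pod name (first column),
   # replacing the separate `grep` and `cut -d ' ' -f 1` steps.
   awk '/launcher/ {print $1}' /tmp/pods.txt
   ```

   Against a live cluster, the same idea reads `kubectl logs -f $(kubectl get pods | awk '/launcher/ {print $1}')`.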
4. Run the EFA NCCL test

   To run the EFA NCCL test, execute the following command:

   ```sh
   kubectl apply -f https://raw.githubusercontent.com/aws-samples/aws-do-eks/main/Container-Root/eks/deployment/efa-device-plugin/test-nccl-efa.yaml
   ```

   ```text
   mpijob.kubeflow.org/test-nccl-efa created
   ```

   Once the launcher pod enters the `Running` or `Completed` state, execute the following to see the test logs:

   ```sh
   kubectl logs -f $(kubectl get pods | grep launcher | cut -d ' ' -f 1)
   ```

   The following lines from the beginning of the log indicate that the test is being performed using EFA:

   ```text
   [1,0]<stdout>:test-nccl-efa-worker-0:21:21 [0] NCCL INFO NET/OFI Selected Provider is efa (found 1 nics)
   [1,0]<stdout>:test-nccl-efa-worker-0:21:21 [0] NCCL INFO Using network AWS Libfabric
   [1,0]<stdout>:NCCL version 2.12.7+cuda11.4
   ```

   Columns 8 and 12 in the output table show the in-place and out-of-place bus bandwidth calculated for the data size listed in column 1; in this run they reached 3.13 and 3.12 GB/s respectively at the largest data sizes (elided from the excerpt below). Your actual results may differ. The average bus bandwidth is displayed at the bottom of the log once the test reaches the maximum data size specified in the MPIJob manifest; here it is 1.15 GB/s.

   ```text
   [1,0]<stdout>:#       size         count      type   redop    root     time   algbw   busbw  #wrong     time   algbw   busbw  #wrong
   [1,0]<stdout>:#        (B)    (elements)                               (us)  (GB/s)  (GB/s)             (us)  (GB/s)  (GB/s)
   ...
   [1,0]<stdout>:      262144         65536     float     sum      -1    195.0    1.34    1.34       0    194.0    1.35    1.35       0
   [1,0]<stdout>:      524288        131072     float     sum      -1    296.9    1.77    1.77       0    291.1    1.80    1.80       0
   [1,0]<stdout>:     1048576        262144     float     sum      -1    583.4    1.80    1.80       0    579.6    1.81    1.81       0
   [1,0]<stdout>:     2097152        524288     float     sum      -1    983.3    2.13    2.13       0    973.9    2.15    2.15       0
   [1,0]<stdout>:     4194304       1048576     float     sum      -1   1745.4    2.40    2.40       0   1673.2    2.51    2.51       0
   ...
   [1,0]<stdout>:# Avg bus bandwidth    : 1.15327
   ```
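
   When scripting this validation, the summary line can be parsed out of a saved launcher log. A minimal sketch (the log file and its single line below are illustrative stand-ins for a real saved log):

   ```sh
   # Illustrative single-line stand-in for a saved NCCL test launcher log.
   printf '%s\n' \
     '[1,0]<stdout>:# Avg bus bandwidth    : 1.15327' > /tmp/nccl-efa.log

   # Print the last whitespace-separated field of the summary line,
   # i.e. the average bus bandwidth in GB/s.
   awk '/Avg bus bandwidth/ {print $NF}' /tmp/nccl-efa.log
   ```

   A scripted check could then compare that value against a minimum expected bandwidth for the instance type.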

## Destroy

{%
  include-markdown "../../docs/_partials/destroy.md"
%}