
Commit 28a0528

Add design document to 'dynamic-mig' feature (#725)
* update documents

Signed-off-by: limengxuan <[email protected]>
Parent: 659bef8

File tree: 3 files changed, +158 -0 lines changed

docs/develop/dynamic-mig.md

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@

# NVIDIA GPU MPS and MIG dynamic slice plugin

## Special Thanks

This feature would not have been implemented without the help of @sailorvii.

## Introduction

NVIDIA GPUs offer several built-in sharing methods: time-slicing, MPS, and MIG. Time-slicing loses time to context switches, so we chose MPS and MIG. MIG profiles are flexible and a user can request a MIG device by profile definition, but the current implementation only defines a fixed set of profiles ahead of the user's request, which limits the usefulness of MIG. We want to develop a plugin that slices the GPU automatically, creating the slice at the moment the user requests it.

For scheduling, node-level binpack and spread policies will be supported. Following the binpack plugin, we consider CPU, memory, GPU memory, and other user-defined resources.
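
A rough sketch of how the node-level policy could be exposed, assuming it is wired through the volcano deviceshare plugin arguments (the argument names `deviceshare.VGPUEnable` and `deviceshare.SchedulePolicy` are illustrative assumptions, not a finalized interface):

```yaml
# Sketch only: a possible volcano-scheduler.conf fragment selecting a node-level
# binpack policy for vGPU scheduling. Argument names are assumptions.
actions: "enqueue, allocate, backfill"
tiers:
  - plugins:
      - name: predicates
      - name: deviceshare
        arguments:
          deviceshare.VGPUEnable: true        # enable vGPU (hami-core / MIG) scheduling
          deviceshare.SchedulePolicy: binpack # or "spread"
```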

HAMi sharing is implemented with [hami-core](https://github.com/Project-HAMi/HAMi-core), a CUDA-hooking library, but MIG is also widely used. A unified API covering both dynamic-mig and hami-core is therefore needed.

## Targets

- Combined scheduling of CPU, memory, and GPU
- Dynamic GPU slicing: hami-core and MIG
- Node-level binpack and spread by GPU memory, CPU, and memory
- A unified vGPU pool across the different virtualization techniques
- Tasks can choose to use MIG, hami-core, or both.

### Config maps

- hami-scheduler-device-configMap

This ConfigMap defines the plugin configuration, including the resource names, the known MIG geometries, and the node-level settings.

```yaml
apiVersion: v1
data:
  device-config.yaml: |
    nvidia:
      resourceCountName: nvidia.com/gpu
      resourceMemoryName: nvidia.com/gpumem
      resourceCoreName: nvidia.com/gpucores
      # Candidate MIG geometries per GPU model; the scheduler picks the first
      # geometry that fits the pending task (see "Procedures" below).
      knownMigGeometries:
        - models: [ "A30" ]
          allowedGeometries:
            -
              - name: 1g.6gb
                memory: 6144
                count: 4
            -
              - name: 2g.12gb
                memory: 12288
                count: 2
            -
              - name: 4g.24gb
                memory: 24576
                count: 1
        - models: [ "A100-SXM4-40GB", "A100-40GB-PCIe", "A100-PCIE-40GB", "A100-SXM4-40GB" ]
          allowedGeometries:
            -
              - name: 1g.5gb
                memory: 5120
                count: 7
            -
              - name: 2g.10gb
                memory: 10240
                count: 3
              - name: 1g.5gb
                memory: 5120
                count: 1
            -
              - name: 3g.20gb
                memory: 20480
                count: 2
            -
              - name: 7g.40gb
                memory: 40960
                count: 1
        - models: [ "A100-SXM4-80GB", "A100-80GB-PCIe", "A100-PCIE-80GB" ]
          allowedGeometries:
            -
              - name: 1g.10gb
                memory: 10240
                count: 7
            -
              - name: 2g.20gb
                memory: 20480
                count: 3
              - name: 1g.10gb
                memory: 10240
                count: 1
            -
              - name: 3g.40gb
                memory: 40960
                count: 2
            -
              - name: 7g.79gb
                memory: 80896
                count: 1
    # Per-node operating mode: hami-core or mig.
    nodeconfig:
      - name: nodeA
        operatingmode: hami-core
      - name: nodeB
        operatingmode: mig
```

## Structure

<img src="./imgs/hami-dynamic-mig-structure.png" width = "400" />

## Examples

Dynamic MIG is compatible with existing HAMi tasks; simply set `nvidia.com/gpu` and `nvidia.com/gpumem`, as in the example below:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod1
spec:
  containers:
    - name: ubuntu-container1
      image: ubuntu:20.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 vGPUs
          nvidia.com/gpumem: 8000 # each vGPU contains 8000M device memory (optional, integer)
```

A task can restrict itself to either `mig` or `hami-core` by setting the annotation `nvidia.com/vgpu-mode` to the corresponding value, as the example below shows:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod1
  annotations:
    nvidia.com/vgpu-mode: "mig"
spec:
  containers:
    - name: ubuntu-container1
      image: ubuntu:20.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 vGPUs
          nvidia.com/gpumem: 8000 # each vGPU contains 8000M device memory (optional, integer)
```

## Procedures

The procedure for a vGPU task that uses dynamic-mig is shown below:

<img src="./imgs/hami-dynamic-mig-procedure.png" width = "800" />

Note that after a task is submitted, the deviceshare plugin iterates over the templates defined in the ConfigMap `hami-scheduler-device` and picks the first available template that fits. You can always change the content of that ConfigMap and restart vc-scheduler to customize this behavior; one such customization is sketched below.
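
For instance, to make the scheduler consider only one layout on A100-PCIE-40GB cards, the corresponding entry could be trimmed down as follows (same format and values as the ConfigMap above; shown purely for illustration):

```yaml
knownMigGeometries:
  - models: [ "A100-PCIE-40GB" ]
    allowedGeometries:
      -
        - name: 2g.10gb   # the only remaining geometry: 3 x 2g.10gb + 1 x 1g.5gb
          memory: 10240
          count: 3
        - name: 1g.5gb
          memory: 5120
          count: 1
```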

If you submit the example above on an empty A100-PCIE-40GB node, the scheduler will select a GPU and choose the MIG geometry below:

```yaml
2g.10gb : 3
1g.5gb : 1
```

It then starts the container with 2 × 2g.10gb instances: each requested vGPU asks for 8000M of device memory, and 2g.10gb (10240M) is the smallest profile in the chosen geometry that can hold it.
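
Under that geometry, the resulting layout on the selected GPU would look roughly like this (derived from the request above; illustrative):

```yaml
# layout of the selected A100-PCIE-40GB after scheduling gpu-pod1 (illustrative)
2g.10gb : 3   # 2 used by gpu-pod1 (8000M of 10240M each), 1 free
1g.5gb : 1    # free
```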