Commit 98f80b5

Merge pull request #652 from uniemimu/hookupdate

add link to gpu_nfdhook and update hook README

2 parents efc7d79 + cbf7bab

2 files changed: +40 -12 lines changed

cmd/gpu_nfdhook/README.md

Lines changed: 39 additions & 11 deletions
@@ -1,23 +1,51 @@
 # Intel GPU NFD hook
 
-This is the Node Feature Discovery binary hook implementation for the Intel
-GPUs. The intel-gpu-initcontainer which is built among other images can be
-placed as part of the gpu-plugin deployment, so that it copies this hook to the
-host system only in those hosts, in which also gpu-plugin is deployed.
+This is the [Node Feature Discovery](https://github.com/kubernetes-sigs/node-feature-discovery)
+binary hook implementation for the Intel GPUs. The intel-gpu-initcontainer, which
+is built among the other images, can be placed as part of the gpu-plugin deployment
+so that it copies this hook to the host system only on those hosts where the
+gpu-plugin is also deployed.
 
 When NFD worker runs this hook, it will add a number of labels to the nodes,
 which can be used for example to deploy services to nodes with specific GPU
 types. Selected numeric labels can be turned into kubernetes extended resources
 by the NFD, allowing for finer grained resource management for GPU-using PODs.
 
-In the NFD deployment, the hook requires /host-sys -folder to have the host /sys
--folder content mounted, and /host-dev to have the host /dev -folder content
-mounted. Write access is not necessary.
+In the NFD deployment, the hook requires the `/host-sys` folder to have the host `/sys` folder content mounted. Write access is not necessary.
+
+## GPU memory
 
 GPU memory amount is read from sysfs gt/gt* files and turned into a label.
-There are two supported environment variables named GPU_MEMORY_OVERRIDE and
-GPU_MEMORY_RESERVED. Both are supposed to hold numeric values. For systems with
+There are two supported environment variables named `GPU_MEMORY_OVERRIDE` and
+`GPU_MEMORY_RESERVED`. Both are supposed to hold numeric byte amounts. For systems with
 older kernel drivers or GPUs which do not support reading the GPU memory
-amount, the GPU_MEMORY_OVERRIDE environment variable value is turned into a GPU
-memory amount label instead of a read value. GPU_MEMORY_RESERVED value will be
+amount, the `GPU_MEMORY_OVERRIDE` environment variable value is turned into a GPU
+memory amount label instead of a read value. The `GPU_MEMORY_RESERVED` value will be
 scoped out from the GPU memory amount found from sysfs.
+
+## Default labels
+
+The following labels are created by default. You may turn the numeric labels into
+extended resources with NFD.
+
+| name | type | description |
+|------|------|-------------|
+| `gpu.intel.com/millicores` | number | node GPU count * 1000. Can be used as a finer-grained shared execution fraction. |
+| `gpu.intel.com/memory.max` | number | sum of detected [GPU memory amounts](#gpu-memory) in bytes, OR the environment variable value * GPU count |
+| `gpu.intel.com/cards` | string | list of card names separated by '`.`'. The names match the host `card*` folders under `/sys/class/drm/`. |
+
+## Capability labels (optional)
+
+Capability labels are created from information found inside debugfs, and therefore
+unfortunately require running the NFD worker as root. Since debugfs content
+is not guaranteed to be stable, these labels are not guaranteed to be stable either.
+If you don't need them, simply do not run the NFD worker as root; that is also more
+secure. Depending on your kernel driver, running the NFD hook as root may introduce
+the following labels:
+
+| name | type | description |
+|------|------|-------------|
+| `gpu.intel.com/platform_gen` | string | GPU platform generation name, typically a number. |
+| `gpu.intel.com/platform_<PLATFORM_NAME>_.count` | number | GPU count for the named platform. |
+| `gpu.intel.com/platform_<PLATFORM_NAME>_.tiles` | number | GPU tile count in the GPUs of the named platform. |
+| `gpu.intel.com/platform_<PLATFORM_NAME>_.present` | string | "true", indicating the presence of the GPU platform. |
+
+For the above to work as intended, the installed GPUs must be identical in their capabilities.
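The memory-label arithmetic described in the hunk above can be sketched roughly as follows. This is a hypothetical illustration of the documented behavior only: the real hook is written in Go, and the variable values here are made up.

```shell
# Hypothetical sketch of the documented memory.max logic; not the hook's code.
GPU_MEMORY_OVERRIDE=8589934592   # fallback when sysfs exposes no memory amount
GPU_MEMORY_RESERVED=268435456    # bytes scoped out of the sysfs-read amount
sysfs_memory=17179869184         # pretend value read from sysfs gt/gt* files

if [ "$sysfs_memory" -gt 0 ]; then
  # Reserved bytes are scoped out of the detected amount.
  memory_max=$((sysfs_memory - GPU_MEMORY_RESERVED))
else
  # No readable amount: the override value is used as-is.
  memory_max=$GPU_MEMORY_OVERRIDE
fi
echo "gpu.intel.com/memory.max=$memory_max"
```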

cmd/gpu_plugin/README.md

Lines changed: 1 addition & 1 deletion
@@ -146,7 +146,7 @@ daemonset.apps/intel-gpu-plugin created
 
 Usage of fractional GPU resources, such as GPU memory, requires that the cluster has node
 extended resources with the name prefix `gpu.intel.com/`. Those can be created with NFD
-by running the hook installed by the plugin initcontainer. When fractional resources are
+by running the [hook](/cmd/gpu_nfdhook/) installed by the plugin initcontainer. When fractional resources are
 enabled, the plugin lets a [scheduler extender](https://github.com/intel/platform-aware-scheduling/tree/master/gpu-aware-scheduling)
 do card selection decisions based on resource availability and the amount of extended
 resources requested in the [pod spec](https://github.com/intel/platform-aware-scheduling/blob/master/gpu-aware-scheduling/docs/usage.md#pods).
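As an illustration of the fractional-resource usage this hunk refers to, a pod spec fragment requesting the extended resources might look like the sketch below. The image name and the amounts are made-up placeholders; only the `gpu.intel.com/` resource-name prefix comes from the README.

```yaml
# Hypothetical pod spec requesting NFD-created extended resources.
# 500 millicores would be half of one GPU's shared execution fraction
# under the "GPU count * 1000" scheme described in the hook README.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  containers:
  - name: app
    image: example.com/app:latest   # placeholder image
    resources:
      limits:
        gpu.intel.com/i915: 1
        gpu.intel.com/millicores: 500
```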
