Load NVIDIA Kernel Modules for JIT-CDI mode #975
This change attempts to load the nvidia, nvidia-uvm, and nvidia-modeset kernel modules before generating the automatic (jit) CDI specification. The kernel modules can be controlled by the nvidia-container-runtime.modes.jit-cdi.load-kernel-modules config option. If this is set to the empty list, then no kernel modules are loaded. Errors in loading the kernel modules are logged, but ignored.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
What is this good for?
    mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"

    [nvidia-container-runtime.modes.jit-cdi]
    load-kernel-modules = ["nvidia", "nvidia-uvm", "nvidia-modeset"]
How did we end up making this fine selection? 🍷 🍇 🧀
These are the kernel modules that are required for various GPU functionalities. In the nvidia-container-cli we do this through nvidia-modprobe:
nvidia: https://github.com/NVIDIA/libnvidia-container/blob/95d3e86522976061e856724867ebcaf75c4e9b60/src/nvc.c#L279
nvidia-uvm: https://github.com/NVIDIA/libnvidia-container/blob/95d3e86522976061e856724867ebcaf75c4e9b60/src/nvc.c#L305
nvidia-modeset: https://github.com/NVIDIA/libnvidia-container/blob/95d3e86522976061e856724867ebcaf75c4e9b60/src/nvc.c#L314
This is actually a typo. The module names should be nvidia, nvidia_uvm and nvidia_modeset.
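To illustrate the naming point: the kernel reports loaded modules with underscores (as seen in lsmod), and modprobe treats `-` and `_` in module names interchangeably. A minimal sketch of normalizing configured names to the underscore form follows. The helper `normalizeModuleName` is hypothetical and not part of this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeModuleName is a hypothetical helper (not from this PR) that
// converts a configured module name to the underscore form that the
// kernel reports, for example "nvidia-uvm" becomes "nvidia_uvm".
func normalizeModuleName(name string) string {
	return strings.ReplaceAll(strings.TrimSpace(name), "-", "_")
}

func main() {
	// The hyphenated names are the ones from the original config default.
	for _, name := range []string{"nvidia", "nvidia-uvm", "nvidia-modeset"} {
		fmt.Println(normalizeModuleName(name))
	}
}
```

Normalizing up front would let a config accept either spelling while logs and comparisons against the loaded-module list use a single form.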
I have updated the description with more motivation.
    }

    // TODO: Consider moving this into the nvcdi API.
    if err := driver.LoadKernelModules(cfg.NVIDIAContainerRuntimeConfig.Modes.JitCDI.LoadKernelModules...); err != nil {
@klueska are there any cases where we DON'T want to load (or try to load) the kernel modules? Note that we also skip this when running in a user namespace in libnvidia-container.
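For reference, one common heuristic for detecting a user namespace is to check whether /proc/self/uid_map is the full identity mapping. The sketch below assumes that heuristic; it is not necessarily how libnvidia-container performs its check, and `isInitialUserNamespace` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isInitialUserNamespace reports whether uidMap (the contents of
// /proc/self/uid_map) is the single identity mapping "0 0 4294967295",
// which is what a process in the initial user namespace sees.
// Assumption: this heuristic is illustrative only.
func isInitialUserNamespace(uidMap string) bool {
	fields := strings.Fields(uidMap)
	return len(fields) == 3 &&
		fields[0] == "0" && fields[1] == "0" && fields[2] == "4294967295"
}

func main() {
	data, err := os.ReadFile("/proc/self/uid_map")
	if err != nil {
		fmt.Println("could not read uid_map:", err)
		return
	}
	if isInitialUserNamespace(string(data)) {
		fmt.Println("initial user namespace: would attempt to load kernel modules")
	} else {
		fmt.Println("user namespace detected: would skip loading kernel modules")
	}
}
```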
    return []string{"/usr/local/share", "/usr/share"}
    }

    // LoadKmods loads the specified kernel modules in the driver root.
    - // LoadKmods loads the specified kernel modules in the driver root.
    + // LoadKernelModules loads the specified kernel modules in the driver root.
Closing this one. Will reopen if required.
This change attempts to load the nvidia, nvidia-uvm, and nvidia-modeset kernel modules (NVIDIA kernel-mode GPU drivers) before generating the automatic (jit) CDI specification. This aligns the behaviour of the JIT-CDI mode with that of the `nvidia-container-cli`.

The kernel modules can be controlled by the `nvidia-container-runtime.modes.jit-cdi.load-kernel-modules` config option. If this is set to the empty list, then no kernel modules are loaded.
Errors in loading the kernel modules are logged, but ignored.