Description
I am running multiple Docker containers using the following command, which exposes all GPU devices to each container:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
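For comparison, the Docker `--gpus` flag also accepts a `device=` filter, so a container can be pinned to a subset of the host's GPUs instead of receiving all of them. A sketch of both invocations (the device indices here are illustrative):

```shell
# Expose every GPU on the host to the container (the command above).
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

# Expose only physical GPUs 1-4; inside the container they are
# renumbered 0-3, so a process in this container cannot touch GPU 0.
sudo docker run --rm --runtime=nvidia --gpus '"device=1,2,3,4"' ubuntu nvidia-smi
```

Note the extra quoting around `"device=1,2,3,4"`: the inner double quotes keep the shell from splitting the comma-separated list.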
Scenario:
Container C1 runs Process P1, which internally needs to use GPU IDs 1, 2, 3, and 4.
Container C2 runs Process P2 at the same time.
Since both containers are granted access to all GPU devices, I want to understand the expected behavior.
Question
Is it possible that P2 in C2 also selects GPUs 1, 2, 3, and 4, leading to GPU resource contention, crashes, or failed inference/training runs?
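To make the failure mode concrete: with `--gpus all`, each container sees the full set of physical GPUs, so nothing stops both processes from picking the same device indices. The sketch below models this with a hypothetical `visible_gpus` helper that mimics how a `CUDA_VISIBLE_DEVICES` mask filters and renumbers devices (an 8-GPU host is assumed for illustration):

```python
def visible_gpus(mask, host_gpu_count=8):
    """Physical GPU indices a CUDA process can see under a given
    CUDA_VISIBLE_DEVICES-style mask (None means no mask: all GPUs)."""
    if mask is None:
        return list(range(host_gpu_count))
    # The mask lists physical indices; the process sees them
    # renumbered as logical devices 0..N-1 in this order.
    return [int(i) for i in mask.split(",")]

# C1's process P1 is written to use GPUs 1, 2, 3, 4.
c1 = visible_gpus("1,2,3,4")

# C2 is started with --gpus all and no mask, so P2 sees every GPU
# and is free to schedule work on 1-4 as well.
c2 = visible_gpus(None)

overlap = sorted(set(c1) & set(c2))
print(overlap)  # the physical GPUs both processes may contend for
```

If `overlap` is non-empty, both processes can allocate memory and launch kernels on the same physical devices; whether that leads to out-of-memory errors, slowdowns, or crashed runs depends on how much each process allocates, not on any isolation from Docker itself.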