Using this docker file:

FROM runpod/pytorch:2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04

and running:

import inference.models.yolo_world.yolo_world

YOLO = inference.models.yolo_world.yolo_world.YOLOWorld(model_id="yolo_world/l")

causes the following error:
UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)

Creating inference sessions

UserWarning: Specified provider 'OpenVINOExecutionProvider' is not in available provider names.Available providers: 'TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider'

EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=-593199125 ; hostname=0a84033fcf95 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=238 ; expr=cudaSetDevice(info_.device_id);
 when using ['CUDAExecutionProvider', 'OpenVINOExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 435, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=-593199125 ; hostname=0a84033fcf95 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=238 ; expr=cudaSetDevice(info_.device_id);

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/scripts/temp.py", line 4, in <module>
    YOLO = inference.models.yolo_world.yolo_world.YOLOWorld(model_id="yolo_world/l")
  File "/usr/local/lib/python3.10/dist-packages/inference/models/yolo_world/yolo_world.py", line 54, in __init__
    clip_model = Clip(model_id="clip/ViT-B-32")
  File "/usr/local/lib/python3.10/dist-packages/inference/models/clip/clip_model.py", line 65, in __init__
    self.visual_onnx_session = onnxruntime.InferenceSession(
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 394, in __init__
    raise fallback_error from e
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 389, in __init__
    self._create_inference_session(self._fallback_providers, None)
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 435, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=-593199125 ; hostname=0a84033fcf95 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=238 ; expr=cudaSetDevice(info_.device_id);
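Since the log mixes a PyTorch CUDA-initialization warning with ONNX Runtime provider failures, it may help to narrow down whether CUDA itself is broken inside the runpod image or only the onnxruntime CUDA provider. A minimal diagnostic sketch (only standard torch/onnxruntime calls, not the inference package) that can be run in both containers for comparison:

import torch
import onnxruntime as ort

# PyTorch's view of the GPU: if CUDA initialization itself is broken,
# error 804 should already surface when a context is created here.
print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print(torch.zeros(1).cuda())  # forces an actual CUDA context

# ONNX Runtime's view: which execution providers the installed wheel offers.
print("onnxruntime", ort.__version__, "device:", ort.get_device())
print("providers:", ort.get_available_providers())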
The same Python script using
FROM pytorch/pytorch:2.2.2-cuda12.1-cudnn8-runtime
works as expected.
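Not a fix for the runpod image itself, but as a stopgap it may be possible to force CPU-only sessions so the model at least loads. This sketch assumes the inference package honors the ONNXRUNTIME_EXECUTION_PROVIDERS environment variable and an onnxruntime_execution_providers constructor argument; both are assumptions about the installed version, not verified here:

import os

# Assumption: set before the first inference import so it is picked up
# when the package builds its ONNX Runtime sessions.
os.environ["ONNXRUNTIME_EXECUTION_PROVIDERS"] = "[CPUExecutionProvider]"

from inference.models.yolo_world.yolo_world import YOLOWorld

# Assumption: the constructor also accepts an explicit provider list.
model = YOLOWorld(
    model_id="yolo_world/l",
    onnxruntime_execution_providers=["CPUExecutionProvider"],
)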
nvidia-smi

Mon Jun  3 22:59:43 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05    Driver Version: 525.147.05    CUDA Version: 12.1   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   63C    P0    25W /  80W |   1538MiB /  8192MiB |     94%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
docker-compose.yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [ gpu ]
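To rule out the compose-level GPU reservation itself, a quick check from inside the running container can confirm the device is actually visible to both the driver and the runtime. A minimal sketch (assumes nvidia-smi is on the container's PATH):

import subprocess

import torch

# Driver-level view of the GPU passed through by the compose reservation.
subprocess.run(["nvidia-smi"], check=False)

# Runtime-level view: should report 1 if the reservation took effect.
print("visible CUDA devices:", torch.cuda.device_count())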
Bump, the same thing is happening with the latest image as well: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04.
@patrickwasp were you able to figure it out?