Compile with TORCH_USE_CUDA_DSA to enable device-side assertions. #162

Open
chensyo opened this issue Mar 8, 2025 · 7 comments

@chensyo

chensyo commented Mar 8, 2025

After updating to the new version 0.1.4 I get this error; 0.1.3 worked fine before. Environment: Python 3.11, torch 2.6.0 + CUDA 12.6.
!!! Exception during processing !!! CUDA error: operation not supported
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Traceback (most recent call last):
File "D:\Program Files\ComfyUI\execution.py", line 327, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Program Files\ComfyUI\execution.py", line 202, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Program Files\ComfyUI\execution.py", line 174, in _map_node_over_list
process_inputs(input_dict, i)
File "D:\Program Files\ComfyUI\execution.py", line 163, in process_inputs
results.append(getattr(obj, func)(**inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Program Files\ComfyUI\custom_nodes\svdquant\nodes\models\flux.py", line 134, in load_model
transformer = transformer.to(device)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\diffusers\models\modeling_utils.py", line 1077, in to
return super().to(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py", line 1343, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py", line 903, in _apply
module._apply(fn)
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py", line 903, in _apply
module._apply(fn)
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py", line 903, in _apply
module._apply(fn)
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py", line 930, in _apply
param_applied = fn(param)
^^^^^^^^^
File "C:\Users\Mingzhenwang\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py", line 1329, in convert
return t.to(
^^^^^
RuntimeError: CUDA error: operation not supported
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Prompt executed in 5.77 seconds
FETCH ComfyRegistry Data: 10/36
FETCH ComfyRegistry Data: 15/36
got prompt
[2025-03-08 21:26:51.585] [info] Initializing QuantizedFluxModel
[2025-03-08 21:26:52.465] [info] Loading weights from D:\Program Files\ComfyUI\models\diffusion_models\svdq-int4-flux.1-schnell\transformer_blocks.safetensors
[2025-03-08 21:26:52.466] [warning] Failed to load safetensors using method MIO: CUDA error: operation not supported (at D:\Program Files\BuildWhl\nunchaku\src\Serialization.cpp:130)
[2025-03-08 21:26:57.272] [info] Done.
!!! Exception during processing !!! CUDA error: operation not supported
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

@sxtyzhangzk
Collaborator

Could you check if the latest commit fixes this issue? Thanks.

sxtyzhangzk reopened this Mar 8, 2025

@chensyo
Author

chensyo commented Mar 8, 2025

I pulled the latest main with git pull and saw that common.h had been updated. I rebuilt and reinstalled, but the error persists:
RuntimeError: CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
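
For readers who want to follow the hint in the message above, here is a minimal sketch of setting CUDA_LAUNCH_BLOCKING=1 from Python. The helper name `enable_blocking_launches` is hypothetical, and the sketch assumes the variable is set before torch initializes CUDA. In this thread the failure appears while weights are being loaded and moved to the device, so synchronous launches may not change the reported stack trace.

```python
import os


def enable_blocking_launches(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given mapping (default: os.environ).

    Hypothetical helper for illustration; not part of torch or nunchaku.
    """
    target = os.environ if env is None else env
    target["CUDA_LAUNCH_BLOCKING"] = "1"
    return target


if __name__ == "__main__":
    enable_blocking_launches()
    # Assumption: torch is imported only after this point, so that the CUDA
    # runtime sees the variable when it initializes.
    # import torch
    print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

Setting the variable in the shell before launching ComfyUI should have the same effect.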

@hdfhssg
Contributor

hdfhssg commented Mar 8, 2025

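> I pulled the latest main with git pull and saw that common.h had been updated. I rebuilt and reinstalled, but the error persists:
> RuntimeError: CUDA error: operation not supported
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

You're on Windows 10, right? I get this error on Windows 10 too. I have already fixed it by modifying the source files; once I have tested a bit more and confirmed it works, I'll post my changes. Alternatively, you can wait for the maintainers to see whether there is a better fix.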

@sxtyzhangzk
Collaborator

> I pulled the latest main with git pull and saw that common.h had been updated. I rebuilt and reinstalled, but the error persists:
> RuntimeError: CUDA error: operation not supported
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Could you try setting the environment variable NUNCHAKU_LOAD_METHOD?
Try
set NUNCHAKU_LOAD_METHOD=READ
or
set NUNCHAKU_LOAD_METHOD=READNOPIN
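
For anyone who launches ComfyUI from a script rather than from a cmd prompt, the suggestion above could be sketched in Python as follows. The helper `env_with_load_method` is hypothetical. Only the two values named in this thread are accepted, and the commented-out launch command assumes ComfyUI is started through a `main.py` entry point in the current directory.

```python
import os

# Values suggested in this thread; other values may exist but are not
# confirmed here.
SUGGESTED_METHODS = ("READ", "READNOPIN")


def env_with_load_method(method, base_env=None):
    """Return a copy of the environment with NUNCHAKU_LOAD_METHOD set.

    Hypothetical helper for illustration; not part of nunchaku.
    """
    if method not in SUGGESTED_METHODS:
        raise ValueError(f"expected one of {SUGGESTED_METHODS}, got {method!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["NUNCHAKU_LOAD_METHOD"] = method
    return env


if __name__ == "__main__":
    env = env_with_load_method("READ")
    print("NUNCHAKU_LOAD_METHOD =", env["NUNCHAKU_LOAD_METHOD"])
    # Assumed launch command; adjust the path to your ComfyUI checkout:
    # import subprocess, sys
    # subprocess.run([sys.executable, "main.py"], env=env, check=True)
```

Passing a copied environment to the child process keeps the setting scoped to that one launch, which makes it easy to try READ first and READNOPIN second.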

@chensyo
Author

chensyo commented Mar 8, 2025

> You're on Windows 10, right? I get this error on Windows 10 too. I have already fixed it by modifying the source files; once I have tested a bit more and confirmed it works, I'll post my changes. Alternatively, you can wait for the maintainers to see whether there is a better fix.

Yes, I'm on Win10. Version 0.1.3 previously raised the "Unable to pin memory: operation not supported" error and then crashed. After I compiled with your modified Serialization.cpp, it still reported the error but worked normally. Now that I have updated, it no longer works.

@hdfhssg
Contributor

hdfhssg commented Mar 8, 2025

> Could you try setting the environment variable NUNCHAKU_LOAD_METHOD? Try set NUNCHAKU_LOAD_METHOD=READ or set NUNCHAKU_LOAD_METHOD=READNOPIN

OK, I'll try later.

@chensyo
Author

chensyo commented Mar 8, 2025

> Could you try setting the environment variable NUNCHAKU_LOAD_METHOD? Try set NUNCHAKU_LOAD_METHOD=READ or set NUNCHAKU_LOAD_METHOD=READNOPIN

That works; everything is back to normal. Thanks.
