torch._C._LinAlgError when quantizing Qwen2.5-VL-3B with GPTQ #3046

Closed
ggysl opened this issue Feb 10, 2025 · 1 comment

ggysl commented Feb 10, 2025

Describe the bug

CUDA_VISIBLE_DEVICES=1 swift export \
    --model path/to/ckp \
    --quant_bits 4 \
    --load_data_args true \
    --attn_impl flash_attn \
    --quant_n_samples 256 \
    --quant_method gptq

Error:

Traceback (most recent call last):                                                                                                                                                                                                                                                                          
  File "/workspace/swtogo/ms-swift/swift/cli/export.py", line 5, in <module>
    export_main()
  File "/workspace/swtogo/ms-swift/swift/llm/export/export.py", line 44, in export_main
    return SwiftExport(args).main()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/swtogo/ms-swift/swift/llm/base.py", line 46, in main
    result = self.run()
             ^^^^^^^^^^
  File "/workspace/swtogo/ms-swift/swift/llm/export/export.py", line 29, in run
    quantize_model(args)
  File "/workspace/swtogo/ms-swift/swift/llm/export/quant.py", line 213, in quantize_model
    QuantEngine(args).quantize()
  File "/workspace/swtogo/ms-swift/swift/llm/export/quant.py", line 43, in quantize
    gptq_quantizer = self.gptq_model_quantize()
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/swtogo/ms-swift/swift/llm/export/quant.py", line 207, in gptq_model_quantize
    gptq_quantizer.quantize_model(self.model, self.tokenizer)
  File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/optimum/gptq/quantizer.py", line 640, in quantize_model
    quant_outputs = gptq[name].fasterquant(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/swtogo/AutoGPTQ/auto_gptq/quantization/gptq.py", line 116, in fasterquant
    H = torch.linalg.cholesky(H)
        ^^^^^^^^^^^^^^^^^^^^^^^^
torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).
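
For context on what this error means: GPTQ builds a Hessian H from the calibration activations and Cholesky-factorizes it, and the factorization only exists if H is positive-definite. A dead input channel (a column the calibration data never activates) or NaN/Inf values leaking in from a buggy dependency will break that property. Below is a minimal, hypothetical sketch of the failure mode and of the diagonal damping that AutoGPTQ applies (its percdamp parameter, default 0.01) to guard against it; this is not the ms-swift code path itself:

import torch

# Hypothetical calibration activations; not ms-swift data.
X = torch.randn(64, 128)
X[:, 0] = 0.0                 # a "dead" input channel -> zero row/column in H
H = X.T @ X                   # Gram/Hessian proxy: positive semi-definite only

try:
    torch.linalg.cholesky(H)
except torch._C._LinAlgError as e:
    print(e)                  # "...leading minor of order 1 is not positive-definite..."

# Damping the diagonal (as AutoGPTQ does via percdamp) restores
# positive-definiteness, unless NaNs have already poisoned H.
percdamp = 0.01
H.diagonal().add_(percdamp * torch.mean(torch.diag(H)))
L = torch.linalg.cholesky(H)  # now succeeds
print(L.shape)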

Your hardware and system info

GPU: RTX 3090

pytorch        2.5.1
cuda           12.1
ms-swift       3.1.0.dev0
transformers   4.49.0.dev0
auto_gptq      0.7.1
optimum        1.25.0.dev0
flash_attn     2.7.4.post1

Additional context

The same configuration quantizes Qwen2-VL without any problem; the error only appears with Qwen2.5-VL.

ggysl commented Feb 24, 2025

Solved. Installing GPTQModel updated numpy to 2.2.3, and the problem went away; it was a version issue after all.
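
For anyone hitting the same thing, a quick way to confirm which versions are actually active in the environment before re-running the export (a hypothetical check script, not part of ms-swift):

# Print the versions of the packages involved, so a stale numpy
# (the culprit here) is easy to spot before quantizing again.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("numpy", "torch", "transformers", "optimum", "auto-gptq", "gptqmodel"):
    try:
        print(f"{pkg:14s} {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg:14s} (not installed)")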

ggysl closed this as completed Feb 24, 2025