torch._C._LinAlgError when quantizing Qwen2.5-VL-3B with GPTQ #3046

Closed
ggysl opened this issue Feb 10, 2025 · 1 comment

ggysl commented Feb 10, 2025

Describe the bug

CUDA_VISIBLE_DEVICES=1 swift export \
    --model path/to/ckp \
    --quant_bits 4 \
    --load_data_args true \
    --attn_impl flash_attn \
    --quant_n_samples 256 \
    --quant_method gptq

Error:

Traceback (most recent call last):                                                                                                                                                                                                                                                                          
  File "/workspace/swtogo/ms-swift/swift/cli/export.py", line 5, in <module>
    export_main()
  File "/workspace/swtogo/ms-swift/swift/llm/export/export.py", line 44, in export_main
    return SwiftExport(args).main()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/swtogo/ms-swift/swift/llm/base.py", line 46, in main
    result = self.run()
             ^^^^^^^^^^
  File "/workspace/swtogo/ms-swift/swift/llm/export/export.py", line 29, in run
    quantize_model(args)
  File "/workspace/swtogo/ms-swift/swift/llm/export/quant.py", line 213, in quantize_model
    QuantEngine(args).quantize()
  File "/workspace/swtogo/ms-swift/swift/llm/export/quant.py", line 43, in quantize
    gptq_quantizer = self.gptq_model_quantize()
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/swtogo/ms-swift/swift/llm/export/quant.py", line 207, in gptq_model_quantize
    gptq_quantizer.quantize_model(self.model, self.tokenizer)
  File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/optimum/gptq/quantizer.py", line 640, in quantize_model
    quant_outputs = gptq[name].fasterquant(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/swtogo/AutoGPTQ/auto_gptq/quantization/gptq.py", line 116, in fasterquant
    H = torch.linalg.cholesky(H)
        ^^^^^^^^^^^^^^^^^^^^^^^^
torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).
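
For context on what this error means: GPTQ builds a Hessian H from the calibration activations and Cholesky-factorizes it, and the factorization only exists if H is positive-definite. A dead input channel (a column the calibration data never activates) or NaN/Inf values leaking in from a buggy dependency will break that property. Below is a minimal, hypothetical sketch of the failure mode and of the diagonal damping that AutoGPTQ applies (its percdamp parameter, default 0.01) to guard against it; this is not the ms-swift code path itself:

import torch

# Hypothetical calibration activations; not ms-swift data.
X = torch.randn(64, 128)
X[:, 0] = 0.0                 # a "dead" input channel -> zero row/column in H
H = X.T @ X                   # Gram/Hessian proxy: positive semi-definite only

try:
    torch.linalg.cholesky(H)
except torch._C._LinAlgError as e:
    print(e)                  # "...leading minor of order 1 is not positive-definite..."

# Damping the diagonal (as AutoGPTQ does via percdamp) restores
# positive-definiteness, unless NaNs have already poisoned H.
percdamp = 0.01
H.diagonal().add_(percdamp * torch.mean(torch.diag(H)))
L = torch.linalg.cholesky(H)  # now succeeds
print(L.shape)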

Your hardware and system info

GPU: RTX 3090

pytorch        2.5.1
cuda           12.1
ms-swift       3.1.0.dev0
transformers   4.49.0.dev0
auto_gptq      0.7.1
optimum        1.25.0.dev0
flash_attn     2.7.4.post1

Additional context

The same configuration quantizes Qwen2-VL without any problem; the error only appears with Qwen2.5-VL.

ggysl commented Feb 24, 2025

Solved. Installing GPTQModel updated numpy to 2.2.3, and the problem went away; it was a version issue after all.
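
For anyone hitting the same thing, a quick way to confirm which versions are actually active in the environment before re-running the export (a hypothetical check script, not part of ms-swift):

# Print the versions of the packages involved, so a stale numpy
# (the culprit here) is easy to spot before quantizing again.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("numpy", "torch", "transformers", "optimum", "auto-gptq", "gptqmodel"):
    try:
        print(f"{pkg:14s} {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg:14s} (not installed)")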

ggysl closed this as completed Feb 24, 2025