Eval bug: Qwen3 Q4_0 not working with SYCL #13163
Comments
I've also had issues with Q4_0 quants on SYCL resulting in the screen going black and crashing for my Arc A770M. I experienced this on Gemma 3 12B QAT as well as Llama 2 7B when running performance benchmarks. I believe the SYCL Q4_0 reorder optimizations resulted in this as setting
Hi @Sketchfellow,
I've been able to reproduce the issue, but only on Windows. Linux seems unaffected. As reported,
Let me check!
Hi, I confirmed it works properly on version: 5402 (0a338ed). Let me close the issue. Thank you for your support.
Name and Version
version: 5215(5f5e39e)
built with MSVC 19343.34808.0
Operating systems
Windows
GGML backends
SYCL
Hardware
Core Ultra 5 125U, 32 GB memory (ThinkPad X1 Carbon Gen 12)
Driver Version: 32.0.101.6739
Models
Qwen3-4B-gguf Q4_0 (https://huggingface.co/unsloth/Qwen3-4B-GGUF/tree/main)
Problem description & steps to reproduce
When attempting inference with the Q4_0 model, the screen briefly goes black and inference does not work properly. The Q4_K_M model, however, operates normally.
In addition, the CUDA build (cu11.7, b5215) works properly with Q4_0.
How to reproduce:
In the event log, llama-cli.exe shows the following application error:

First Bad Commit
No response
Relevant log output