Eval bug: Qwen3 Q4_0 not working with SYCL #13163

invent00 · 2025-04-29T02:37:26Z

Name and Version

version: 5215(5f5e39e)
built with MSVC 19343.34808.0

Operating systems

Windows

GGML backends

SYCL

Hardware

Core Ultra 5 125U, 32 GB RAM (ThinkPad X1 Carbon Gen 12)
Driver Version: 32.0.101.6739

Models

Qwen3-4B-gguf Q4_0 (https://huggingface.co/unsloth/Qwen3-4B-GGUF/tree/main)

Problem description & steps to reproduce

When I run inference with this model, the screen briefly goes black and inference fails. The Q4_K_M model, however, works normally.

In addition, the CUDA build (cu11.7, b5215) works properly with Q4_0.

How to reproduce:

1. Run llama-cli.exe -ngl 99 -m Qwen3-4B-Q4_0.gguf
2. Enter a question.
3. The screen blacks out.

In the event log, llama-cli.exe shows the following application error:

First Bad Commit

No response

Relevant log output

1. Run llama-cli.exe -ngl 99 -m Qwen3-4B-Q4_0.gguf
2. Enter a question.
3. The screen blacks out.

Sketchfellow · 2025-04-29T03:12:22Z

I've also had issues with Q4_0 quants on SYCL resulting in the screen going black and crashing for my Arc A770M. I experienced this on Gemma 3 12B QAT as well as Llama 2 7B when running performance benchmarks. I believe the SYCL Q4_0 reorder optimizations resulted in this as setting GGML_SYCL_DISABLE_OPT=1 allowed things to run normally again.
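The workaround above amounts to setting one environment variable before launching the binary. A minimal Python sketch, assuming the llama-cli.exe name and flags from the original report; `sycl_env` and `run_llama_cli` are hypothetical helper names, not part of llama.cpp:

```python
import os
import subprocess


def sycl_env(disable_opt, base=None):
    """Return a copy of the environment with GGML_SYCL_DISABLE_OPT set.

    disable_opt=True  -> "1" (skip the reorder optimization; the workaround)
    disable_opt=False -> "0" (request the reorder code path)
    """
    env = dict(os.environ if base is None else base)
    env["GGML_SYCL_DISABLE_OPT"] = "1" if disable_opt else "0"
    return env


def run_llama_cli(model_path, disable_opt=True, exe="llama-cli.exe"):
    """Launch llama-cli with the flags used in the report (hypothetical helper)."""
    cmd = [exe, "-ngl", "99", "-m", model_path]
    # The variable is passed only to the child process; the parent
    # environment is left untouched.
    return subprocess.run(cmd, env=sycl_env(disable_opt), check=False)


# Example call (not executed here):
# run_llama_cli("Qwen3-4B-Q4_0.gguf", disable_opt=True)
```

In a Windows command prompt, running `set GGML_SYCL_DISABLE_OPT=1` before the llama-cli.exe command has the same effect for that session.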

qnixsynapse · 2025-04-29T03:43:54Z

> I believe the SYCL Q4_0 reorder optimizations resulted in this as setting GGML_SYCL_DISABLE_OPT=1 allowed things to run normally again

cc @Rbiessy @NeoZhangJianyu @Alcpz ^

invent00 · 2025-04-29T05:36:56Z

Hi @Sketchfellow,
Thank you for your advice. After setting GGML_SYCL_DISABLE_OPT=1, it works properly.

Alcpz · 2025-04-30T14:21:07Z

I've been able to reproduce the issue, but only on Windows. Linux seems unaffected. As reported, GGML_SYCL_DISABLE_OPT=1 works without problem. There seems to be something wrong with the reorder, but I would need to have a deeper look at it.

NeoZhangJianyu · 2025-05-05T06:06:48Z

Let me check!

sgeor255 · 2025-05-06T14:45:34Z

@invent00 #13109 should fix this issue. Could you check if it works for you? :) Note that you will need to set/export the environment variable GGML_SYCL_DISABLE_OPT=0 to trigger the reorder codepath which was causing the issue.

invent00 · 2025-05-08T14:27:48Z

@sgeor255 Hi, I built d7e5179 and tried it. It works properly with GGML_SYCL_DISABLE_OPT=0 and Qwen3-4B-Q4_0.gguf.

I confirmed GGML_SYCL_DISABLE_OPT=0 is faster than GGML_SYCL_DISABLE_OPT=1.

Once this is merged into main, I will close this issue.
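One way to repeat the speed comparison reported above is to run the same non-interactive command once under each setting and compare what it prints. This is only a sketch: `run_under_both_settings` is a hypothetical helper, and the llama-bench invocation in the comment assumes a llama-bench binary from the same build is available.

```python
import os
import subprocess


def run_under_both_settings(cmd):
    """Run cmd once per GGML_SYCL_DISABLE_OPT value.

    Returns a dict mapping "0" and "1" to the CompletedProcess of each run,
    so the caller can inspect returncode and stdout side by side.
    """
    results = {}
    for value in ("0", "1"):
        # Copy the current environment and override only the one variable.
        env = dict(os.environ, GGML_SYCL_DISABLE_OPT=value)
        results[value] = subprocess.run(
            cmd, env=env, capture_output=True, text=True, check=False
        )
    return results


# Example call (not executed here):
# runs = run_under_both_settings(
#     ["llama-bench.exe", "-m", "Qwen3-4B-Q4_0.gguf", "-ngl", "99"]
# )
# for value, proc in runs.items():
#     print(f"GGML_SYCL_DISABLE_OPT={value} exit={proc.returncode}")
#     print(proc.stdout)
```

Using `check=False` keeps the loop going if one setting crashes, which matters here because the crash itself is the thing being investigated.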

invent00 · 2025-05-16T12:01:07Z

Hi,

I confirmed that it works properly on version 5402 (0a338ed). It also works properly with GGML_SYCL_DISABLE_OPT=0.

Let me close this issue. Thank you for your support.

invent00 added the bug-unconfirmed label Apr 29, 2025

invent00 changed the title ~~Eval bug:~~ Eval bug: Qwen3 Q4_0 not working with SYCL Apr 29, 2025

LarchLiu mentioned this issue Apr 29, 2025

[Bug]: Qwen 3 0.6B and 1.4B crash upon loading model a-ghorbani/pocketpal-ai#279

Closed

qnixsynapse mentioned this issue May 2, 2025

SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled #13254

Merged

sgeor255 mentioned this issue May 6, 2025

sycl : Implemented reorder Q4_K mmvq #13109

Merged

invent00 closed this as completed May 16, 2025
