ggml-cpu/riscv: gate cpu-riscv64 backend on Zv* sub-extensions #1475

Open
mikey wants to merge 1 commit into ggml-org:master from mikey:fix/hwprobe-rva23s64

mikey commented May 2, 2026

The existing runtime hwprobe check only looks at base RVV. When the build emits Zvbb / Zvbc / Zvkb / Zvkn* / Zvfh instructions and the runtime CPU supports RVV but not these additional extensions, the kernels raise SIGILL rather than falling back to scalar.

This can happen if ggml is compiled for a fully featured RVA23 CPU but then run on a CPU that implements only the RVA23 baseline (like qemu -cpu rva23s64).

Probe each sub-extension via RISCV_HWPROBE_KEY_IMA_EXT_0 and, in the backend score function, refuse to register as cpu-riscv64 (return 0) whenever the binary was compiled with __riscv_zvX defined but the runtime CPU lacks zvX.
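
For readers who want to see the shape of the check, here is a minimal sketch of the gating rule described above. It is not the code from the patch: the names `kExt*`, `kCompiledExt`, `riscv64_score`, `runtime_ext_mask` and `riscv64_backend_score` are hypothetical, Zvkn* is left out for brevity, and the sketch assumes a toolchain whose `asm/hwprobe.h` exists when targeting RISC-V Linux. The bit positions are the ones used by the Linux uapi header.

```cpp
#include <cstdint>

#if defined(__riscv) && defined(__linux__)
#include <asm/hwprobe.h>   // struct riscv_hwprobe, RISCV_HWPROBE_KEY_IMA_EXT_0
#include <sys/syscall.h>   // __NR_riscv_hwprobe
#include <unistd.h>        // syscall()
#endif

// Bit positions within RISCV_HWPROBE_KEY_IMA_EXT_0 (Linux uapi values).
constexpr uint64_t kExtV    = 1ull << 2;
constexpr uint64_t kExtZvbb = 1ull << 17;
constexpr uint64_t kExtZvbc = 1ull << 18;
constexpr uint64_t kExtZvkb = 1ull << 19;
constexpr uint64_t kExtZvfh = 1ull << 30;

// Extensions the compiler was allowed to emit, derived from the
// predefined __riscv_* macros (hypothetical helper constant).
constexpr uint64_t kCompiledExt = 0
#if defined(__riscv_v)
    | kExtV
#endif
#if defined(__riscv_zvbb)
    | kExtZvbb
#endif
#if defined(__riscv_zvbc)
    | kExtZvbc
#endif
#if defined(__riscv_zvkb)
    | kExtZvkb
#endif
#if defined(__riscv_zvfh)
    | kExtZvfh
#endif
    ;

// Hypothetical score rule: 0 means "do not register this backend",
// 1 means every extension the build relies on is present at runtime.
constexpr int riscv64_score(uint64_t runtime_ext, uint64_t compiled_ext) {
    return (runtime_ext & compiled_ext) == compiled_ext ? 1 : 0;
}

// Reads the runtime extension bitmask; returns 0 if hwprobe is unavailable.
inline uint64_t runtime_ext_mask() {
#if defined(__riscv) && defined(__linux__)
    riscv_hwprobe pair = {};
    pair.key = RISCV_HWPROBE_KEY_IMA_EXT_0;
    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, nullptr, 0) != 0) {
        return 0;
    }
    return pair.value;
#else
    return 0;
#endif
}

// Combines the two masks; only meaningful when built for RISC-V.
inline int riscv64_backend_score() {
    return riscv64_score(runtime_ext_mask(), kCompiledExt);
}
```

Under these assumptions, a build with `__riscv_zvbb` defined scores 0 on a CPU that reports only `V`, so the loader would skip it and pick another backend variant instead of hitting an illegal instruction.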

Bits 17..31 of IMA_EXT_0 may be missing from older asm/hwprobe.h headers, so the patch ships fallback definitions; on kernels >= 6.5 the values match the upstream definitions.
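
As an illustration of what such fallback definitions can look like, the sketch below guards each macro with `#ifndef` so that a newer `asm/hwprobe.h` always takes precedence. Which macros the patch actually defines is not shown here, so the selection is an assumption; the bit values are the ones used by the Linux uapi header.

```cpp
// Pull in the kernel header when it is available, so its definitions win.
#if defined(__riscv) && defined(__linux__) && defined(__has_include)
#  if __has_include(<asm/hwprobe.h>)
#    include <asm/hwprobe.h>
#  endif
#endif

// Fallbacks for headers that predate these bits of RISCV_HWPROBE_KEY_IMA_EXT_0.
#ifndef RISCV_HWPROBE_EXT_ZVBB
#define RISCV_HWPROBE_EXT_ZVBB   (1ULL << 17)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVBC
#define RISCV_HWPROBE_EXT_ZVBC   (1ULL << 18)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVKB
#define RISCV_HWPROBE_EXT_ZVKB   (1ULL << 19)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVKNED
#define RISCV_HWPROBE_EXT_ZVKNED (1ULL << 21)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVKNHA
#define RISCV_HWPROBE_EXT_ZVKNHA (1ULL << 22)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVKNHB
#define RISCV_HWPROBE_EXT_ZVKNHB (1ULL << 23)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVFH
#define RISCV_HWPROBE_EXT_ZVFH   (1ULL << 30)
#endif

// Catch a mismatch at compile time if a header ever disagrees with the fallback.
static_assert(RISCV_HWPROBE_EXT_ZVBB == (1ULL << 17), "unexpected Zvbb bit");
static_assert(RISCV_HWPROBE_EXT_ZVFH == (1ULL << 30), "unexpected Zvfh bit");
```

A kernel that does not know a given bit simply never sets it in the returned value, so on such a system the extension is reported as absent and the backend declines to register.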

Claude Opus was used to find this issue and write the initial version.

For changes to the core ggml library (including to the CMake build system), please open a PR in https://github.com/ggml-org/llama.cpp. Doing so will make your PR more visible, better tested and more likely to be reviewed.
