sync : llama.cpp by ggerganov · Pull Request #1476 · ggml-org/ggml

ggerganov · 2026-05-05T07:34:14Z

No description provided.

* MoE Mxfp4 CLC kernel added, router reorder on GPU
* Pass test-backend-ops for MoE mxfp4 Adreno CLC
* remove putenv in llama-model.cpp
* fix indent style and whitespace
* opencl: remove unnecessary headers
* opencl: do not save cl_program objects
* opencl: remove unnecessary assert
* fix precision issue

---------

Co-authored-by: Li He <lih@qti.qualcomm.com>

* shader(norm): add layer norm ops
* shader(norm): stabilize floating point computation with Kahan summation and handle mixed types
* shader(norm): remove the non-contiguous strides
* shader(norm): use the original implementation rather than the Kahan summation

ggerganov · 2026-05-05T10:13:16Z

@taronaeo The python deps are failing to install on the self-hosted runner:

https://github.com/ggml-org/ggml/actions/runs/25366621707/job/74379012178#step:3:155

Not sure how to fix it. Any ideas?

taronaeo · 2026-05-05T16:11:30Z

> @taronaeo The python deps are failing to install on the self-hosted runner:
>
> https://github.com/ggml-org/ggml/actions/runs/25366621707/job/74379012178#step:3:155
>
> Not sure how to fix it. Any ideas?

This (ggml/requirements.txt, line 11 in ac6f7b4):

torch~=2.5.1

and this (ggml/requirements.txt, line 4 in ac6f7b4):

torchvision>=0.15.2

are mismatching. The latest version of torchvision no longer supports torch 2.5.x, and since we pin torch to the 2.5.x series, pip can't find a suitable version to install.

I would suggest that we update the version requirements to align with llama.cpp (i.e., bump torch to torch~=2.6.0). We should also apply the same supply-chain hardening we did in llama.cpp, since we currently accept any torchvision version at or above 0.15.2.

This problem is isolated to the GGML repository only.
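
To make the specifier semantics above concrete, here is a minimal sketch of how the two operators behave under PEP 440. It is not how pip resolves dependencies; the helper names (`parse`, `compatible_release`, `at_least`) are hypothetical, the candidate version strings are illustrative only, and the code handles plain numeric release versions (no pre-release, post-release, or local tags).

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain numeric version such as '2.5.1' into (2, 5, 1)."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, spec: str) -> bool:
    """Approximate PEP 440 `~=spec`.

    The candidate must be >= spec and must match every component of spec
    except the last one, so `~=2.5.1` means `>=2.5.1, ==2.5.*`.
    """
    cand, base = parse(candidate), parse(spec)
    prefix = base[:-1]
    return cand >= base and cand[: len(prefix)] == prefix


def at_least(candidate: str, spec: str) -> bool:
    """Approximate PEP 440 `>=spec`: a lower bound with no upper bound."""
    return parse(candidate) >= parse(spec)


if __name__ == "__main__":
    # torch~=2.5.1 stays inside the 2.5 series (illustrative candidates).
    for candidate in ("2.5.1", "2.5.9", "2.6.0"):
        print("torch", candidate, compatible_release(candidate, "2.5.1"))

    # torchvision>=0.15.2 accepts every later release, however new.
    for candidate in ("0.15.1", "0.15.2", "99.0.0"):
        print("torchvision", candidate, at_least(candidate, "0.15.2"))
```

The asymmetry is the point: torch is held to one minor series while torchvision may float to its newest release, so the two requirements can drift apart over time.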

shawngu-quic and others added 10 commits May 5, 2026 10:30

ggml-virtgpu: fix circular dependency in headers (llama/22557)

fc2d051

fix: CUDA device PCI bus ID de-dupe OOMing (ignoring other 3 gpus entirely) (llama/22533)

6429639

* fix: CUDA device PCI bus ID detection for multi-GPU de-dupe
* HIP, MUSA macros

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

vulkan: delete dead GGML_VK_MAX_NODES def (llama/22621)

fee68cc

CUDA: use fastdiv for batch index split in get_rows (llama/22650)

f87057e

kleidiai : update to v1.24.0 and use release archive (llama/22549)

13f0caf

ggml : implement fast walsh-hadamard transform for kv rotation (#21352) (llama/22631)

e4527bd

llama : add option to save memory in device buffers (llama/22679)

a3a3494

* llama : add option to save memory in device buffers
* tests : extend llama-save-load-state

sync : llama.cpp

8914191

CISC mentioned this pull request May 5, 2026

Compile bug: b9029 is not compatible with the latest release of ggml 0.10.2 ggml-org/llama.cpp#22698

Closed

ggerganov force-pushed the sync-llama.cpp-26-05-05 branch from 79dc8b5 to 8914191 Compare May 5, 2026 10:12

ggerganov merged commit 5bb7236 into master May 5, 2026
26 of 32 checks passed

ggerganov deleted the sync-llama.cpp-26-05-05 branch May 5, 2026 10:13

taronaeo mentioned this pull request May 6, 2026

requirements: sync requirements.txt with llama.cpp versions #1479

Merged

ggerganov merged 10 commits into master from sync-llama.cpp-26-05-05

10 participants
