With models like Lumina 2.0 and HiDream I1, the future of diffusion models seems to be using autoregressive LLMs (GPTs) as text encoders, for example Google Gemma2 for Lumina or Meta Llama3 for HiDream. These models are already very well supported in llama.cpp, so I'm wondering what the right way to support them would be.

Should llama.cpp be included as a submodule? (This could maybe help T5 run better on GPU too.) Or should sdcpp re-implement these models from scratch?
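To make the question concrete, here is a rough sketch (not sdcpp code, and untested) of what the submodule route could look like: llama.cpp's C API can already run a GGUF Gemma/Llama model in embeddings mode and hand back one hidden-state vector per prompt token, which is roughly what a diffusion pipeline needs from its text encoder. The model path and prompt are placeholders, the function names follow a recent llama.cpp revision (older revisions spell them llama_load_model_from_file / llama_new_context_with_model / llama_n_embd), and the real Lumina/HiDream conditioning presumably taps specific hidden states rather than whatever this returns.

```cpp
// Rough sketch, not sdcpp code: run a GGUF LLM in llama.cpp's embeddings mode
// and read back one hidden-state vector per prompt token.
#include "llama.h"

#include <cstdio>
#include <cstring>
#include <vector>

int main(int argc, char ** argv) {
    // placeholder model path and prompt, just for illustration
    const char * model_path = argc > 1 ? argv[1] : "gemma-2-2b-q8_0.gguf";
    const char * prompt     = "a watercolor lighthouse at dawn";

    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_model_load_from_file(model_path, mparams);
    if (!model) { fprintf(stderr, "failed to load %s\n", model_path); return 1; }

    llama_context_params cparams = llama_context_default_params();
    cparams.embeddings   = true;                    // return embeddings instead of logits
    cparams.pooling_type = LLAMA_POOLING_TYPE_NONE; // keep one vector per token
    cparams.n_ctx        = 512;
    llama_context * ctx = llama_init_from_model(model, cparams);

    // tokenize the prompt
    const llama_vocab * vocab = llama_model_get_vocab(model);
    std::vector<llama_token> tokens(cparams.n_ctx);
    const int n_tokens = llama_tokenize(vocab, prompt, (int) strlen(prompt),
                                        tokens.data(), (int) tokens.size(),
                                        /*add_special=*/true, /*parse_special=*/false);
    if (n_tokens < 0) { fprintf(stderr, "tokenization failed\n"); return 1; }

    // build a batch that requests an output (i.e. an embedding) for every token
    llama_batch batch = llama_batch_init(n_tokens, 0, 1);
    for (int i = 0; i < n_tokens; i++) {
        batch.token   [i]    = tokens[i];
        batch.pos     [i]    = i;
        batch.n_seq_id[i]    = 1;
        batch.seq_id  [i][0] = 0;
        batch.logits  [i]    = true;
    }
    batch.n_tokens = n_tokens;

    if (llama_decode(ctx, batch) != 0) { fprintf(stderr, "decode failed\n"); return 1; }

    // a diffusion pipeline would copy these rows into its conditioning tensor
    const int n_embd = llama_model_n_embd(model);
    for (int i = 0; i < n_tokens; i++) {
        const float * emb = llama_get_embeddings_ith(ctx, i);
        printf("token %3d -> %d-dim vector, emb[0] = %f\n", i, n_embd, emb[0]);
    }

    llama_batch_free(batch);
    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

With llama.cpp vendored as a git submodule, `add_subdirectory(llama.cpp)` in sdcpp's CMakeLists would expose the `llama` target to link this against, which is presumably the main appeal over re-implementing the models in sdcpp's own GGML graphs.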
Replies:

- Also, the way llama.cpp is moving, it is accumulating more and more other features, like CLIP and other embeddings, and TTS with an audio encoder/decoder... it's just a matter of time before VAE and diffusion sampling become desired features of llama.cpp.
- Reimplementing from scratch should definitely be avoided, in my opinion. I think koboldcpp is positioned well.