With models like Lumina 2.0 and HiDream I1, the future of diffusion models seems to be using autoregressive LLMs (GPTs) as text encoders, for example Google Gemma2 for Lumina or Meta Llama3 for HiDream. These models are already very well supported in llama.cpp, so I'm wondering what the right way to support them would be.

Should llama.cpp be included as a submodule? (This could maybe help T5 run better on GPU too.) Or should sdcpp re-implement these models from scratch?
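To make the question concrete, here is a rough sketch (not sdcpp code, and untested) of what the submodule route could look like: llama.cpp's C API can already run a GGUF Gemma/Llama model in embeddings mode and hand back one hidden-state vector per prompt token, which is roughly what a diffusion pipeline needs from its text encoder. The model path and prompt are placeholders, the function names follow a recent llama.cpp revision (older revisions spell them llama_load_model_from_file / llama_new_context_with_model / llama_n_embd), and the real Lumina/HiDream conditioning presumably taps specific hidden states rather than whatever this returns.

```cpp
// Rough sketch, not sdcpp code: run a GGUF LLM in llama.cpp's embeddings mode
// and read back one hidden-state vector per prompt token.
#include "llama.h"

#include <cstdio>
#include <cstring>
#include <vector>

int main(int argc, char ** argv) {
    // placeholder model path and prompt, just for illustration
    const char * model_path = argc > 1 ? argv[1] : "gemma-2-2b-q8_0.gguf";
    const char * prompt     = "a watercolor lighthouse at dawn";

    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_model_load_from_file(model_path, mparams);
    if (!model) { fprintf(stderr, "failed to load %s\n", model_path); return 1; }

    llama_context_params cparams = llama_context_default_params();
    cparams.embeddings   = true;                    // return embeddings instead of logits
    cparams.pooling_type = LLAMA_POOLING_TYPE_NONE; // keep one vector per token
    cparams.n_ctx        = 512;
    llama_context * ctx = llama_init_from_model(model, cparams);

    // tokenize the prompt
    const llama_vocab * vocab = llama_model_get_vocab(model);
    std::vector<llama_token> tokens(cparams.n_ctx);
    const int n_tokens = llama_tokenize(vocab, prompt, (int) strlen(prompt),
                                        tokens.data(), (int) tokens.size(),
                                        /*add_special=*/true, /*parse_special=*/false);
    if (n_tokens < 0) { fprintf(stderr, "tokenization failed\n"); return 1; }

    // build a batch that requests an output (i.e. an embedding) for every token
    llama_batch batch = llama_batch_init(n_tokens, 0, 1);
    for (int i = 0; i < n_tokens; i++) {
        batch.token   [i]    = tokens[i];
        batch.pos     [i]    = i;
        batch.n_seq_id[i]    = 1;
        batch.seq_id  [i][0] = 0;
        batch.logits  [i]    = true;
    }
    batch.n_tokens = n_tokens;

    if (llama_decode(ctx, batch) != 0) { fprintf(stderr, "decode failed\n"); return 1; }

    // a diffusion pipeline would copy these rows into its conditioning tensor
    const int n_embd = llama_model_n_embd(model);
    for (int i = 0; i < n_tokens; i++) {
        const float * emb = llama_get_embeddings_ith(ctx, i);
        printf("token %3d -> %d-dim vector, emb[0] = %f\n", i, n_embd, emb[0]);
    }

    llama_batch_free(batch);
    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

With llama.cpp vendored as a git submodule, `add_subdirectory(llama.cpp)` in sdcpp's CMakeLists would expose the `llama` target to link this against, which is presumably the main appeal over re-implementing the models in sdcpp's own GGML graphs.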
Replies:

- Also, the way llama.cpp is moving, it is accumulating more and more other features, like CLIP and other embeddings, and TTS with an audio encoder/decoder... it's just a matter of time before VAE and diffusion sampling become desired features of llama.cpp.
- Reimplementing from scratch should definitely be avoided, in my opinion. I think koboldcpp is positioned well.