
Multi-GPU Utilization Issue with llama-server #11766

Answered by myan-o
Apscg asked this question in Q&A

Only vertical (layer-wise) split is supported across machines; horizontal split is not.

Vertical split

Each machine holds a contiguous block of layers, so adding machines only expands the available RAM; it does not make inference faster, because every token still passes through the layers sequentially:
pc1 -> pc2 -> pc3 -> ...

Horizontal split

Each machine holds a slice of every layer, so adding machines both expands RAM and speeds up inference, but it also increases the amount of data transferred between machines, since they must synchronize on every layer:
pc1 -> pc1 -> pc1 ...
pc2 -> pc2 -> pc2 ...
pc3 -> pc3 -> pc3 ...
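To make the trade-off above concrete, here is a toy latency model. All numbers (layer count, per-layer compute time, link cost) are illustrative assumptions, not llama.cpp measurements; the point is only the shape of the comparison.

```python
# Toy latency model contrasting the two split strategies.
# All constants are made-up illustrative values.

LAYERS = 32       # transformer layers in the model
DEVICES = 4       # PCs in the cluster
LAYER_MS = 10.0   # time for one device to compute one full layer
LINK_MS = 2.0     # per-hop communication cost between devices

def vertical_split_ms():
    """Vertical (layer) split: each device owns LAYERS/DEVICES layers.
    A token still visits every layer in sequence, so adding devices
    adds memory but not speed -- only inter-device handoffs."""
    compute = LAYERS * LAYER_MS        # all layers still run serially
    hops = (DEVICES - 1) * LINK_MS     # handoffs pc1 -> pc2 -> pc3 ...
    return compute + hops

def horizontal_split_ms():
    """Horizontal (tensor) split: each device owns 1/DEVICES of every
    layer. Layers finish DEVICES times faster, but every layer needs a
    synchronization step across all devices."""
    compute = LAYERS * (LAYER_MS / DEVICES)  # layers computed in parallel
    sync = LAYERS * LINK_MS                  # per-layer synchronization
    return compute + sync

print(f"vertical:   {vertical_split_ms():.1f} ms/token")   # 326.0
print(f"horizontal: {horizontal_split_ms():.1f} ms/token") # 144.0
```

With these numbers the horizontal split wins despite the extra traffic; on a slow network (large LINK_MS) the per-layer sync cost can erase that advantage.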

There are projects that support horizontal splitting, for example:
https://github.com/b4rtaz/distributed-llama
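For reference, a sketch of the relevant llama.cpp invocations. Flag names change between versions, so treat these as assumptions and verify against your build's `--help`; the hostnames and ports are placeholders.

```shell
# Within a single machine, llama-server can split across local GPUs
# either by layer (vertical) or by row (horizontal):
llama-server -m model.gguf --split-mode layer
llama-server -m model.gguf --split-mode row

# Across machines, llama.cpp's RPC backend distributes by layer
# (vertical split). On each worker PC, start an rpc-server, then
# point the main server at the workers:
llama-server -m model.gguf --rpc pc2:50052,pc3:50052
```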

Answer selected by Apscg