How to run Llama3-8b instruct model on multiple GPUs? #7086
aitechguy0105
started this conversation in General

Does llama.cpp support running the Llama3-8b instruct model on multiple GPUs?
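For reference: llama.cpp can split a model across GPUs when built with a multi-device backend such as CUDA, using the n_gpu_layers, split_mode, and main_gpu fields of llama_model_params (on the command line, the equivalent flags are -ngl and --split-mode). Below is a minimal sketch using the C API; it assumes a reasonably recent llama.h, so field and enum names may differ in older builds, and the model path is a placeholder.

```cpp
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;                     // offload all layers to GPU
    mparams.split_mode   = LLAMA_SPLIT_MODE_LAYER; // distribute layers across GPUs
    mparams.main_gpu     = 0;                      // device that holds small tensors

    // placeholder filename -- point this at your GGUF file
    llama_model * model = llama_load_model_from_file(
        "Meta-Llama-3-8B-Instruct.Q4_K_M.gguf", mparams);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        llama_backend_free();
        return 1;
    }

    // ... create a llama_context and generate as usual ...

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```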
-
Why would you run it on multiple GPUs?
-
To run benchmarks. Other models seem to be supported on multiple GPUs.
-
One reason for trying to run on multiple GPUs, as in my case, is that the model does not fit on one. I have two Nvidia 2080 Ti cards, each with around 11 GB of memory. By default it chooses GPU 0. I was wondering if I can parallelize across the two GPUs to run it.
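On the memory question: Llama3-8b in fp16 is roughly 16 GB of weights (8B parameters × 2 bytes), so it cannot fit on a single 11 GB card, while a 4-bit quantization is around 5 GB and would. Either way, the load can be spread over both GPUs via the tensor_split field (the CLI equivalent is --tensor-split 1,1). A sketch under the same API assumptions as above, with a placeholder model path:

```cpp
#include "llama.h"
#include <cstdio>
#include <vector>

int main() {
    llama_backend_init();

    // tensor_split expects one entry per possible device
    // (llama_max_devices()); unused trailing entries stay at 0.
    std::vector<float> split(llama_max_devices(), 0.0f);
    split[0] = 1.0f; // GPU 0
    split[1] = 1.0f; // GPU 1 -- proportions are relative, so 1:1 is an even split

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;                     // offload all layers
    mparams.split_mode   = LLAMA_SPLIT_MODE_LAYER; // split by layer across devices
    mparams.tensor_split = split.data();           // must outlive the load call

    // placeholder filename
    llama_model * model = llama_load_model_from_file(
        "Meta-Llama-3-8B-Instruct.Q4_K_M.gguf", mparams);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        llama_backend_free();
        return 1;
    }

    // ... create a llama_context and run generation as usual ...

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```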