Hi there,
Is there a way to make Comfy treat VRAM the way e.g. Ollama does?
I would like to disable offloading to RAM but keep partial loading for large models. Flux.dev, for example, doesn't quite fit into my 24 GB of VRAM, but with partial loading it's still about as fast as (or faster than) Q8 GGUF or fp8_*, and the output quality is a lot better, especially when rendering images with text.
Currently Comfy does the following: Load text encoders to VRAM -> use text encoders -> offload text encoders to RAM -> partially load Flux (about 95%, hardly any speed reduction) -> render -> offload Flux to RAM -> repeat.
At the same time my OS (Linux) also caches the model files in RAM (in the OS's buff/cache), so the models end up cached in RAM twice.
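A quick way to see the duplication (a rough sketch that only reads /proc/meminfo; the loading step is just a placeholder, not ComfyUI code):

```python
def cached_mib():
    # Size of the Linux page cache ("Cached" in /proc/meminfo), in MiB.
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("Cached:"):
                return int(line.split()[1]) // 1024  # the value is reported in kB
    return 0

before = cached_mib()
# ... read a large checkpoint here, e.g. open(path, "rb").read() ...
after = cached_mib()
print(f"page cache grew by ~{after - before} MiB, on top of the copy the process holds in RAM")
```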
What I'd like to see instead is: Load text encoders to VRAM -> use text encoders -> discard the text encoders from VRAM -> partially load Flux -> render -> discard Flux from VRAM / shared VRAM -> repeat.
That would not only be faster but also more memory efficient.
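To make the difference concrete, here is a rough PyTorch-level sketch of what I mean by "offload" versus "discard" (stand-in models, not ComfyUI internals):

```python
import gc
import torch

# Two stand-ins for a real text encoder (hypothetical sizes, just for illustration).
enc_a = torch.nn.Linear(4096, 4096).cuda()
enc_b = torch.nn.Linear(4096, 4096).cuda()

# "Offload" (what Comfy does today): the weights stay alive as a second copy in system RAM.
enc_a = enc_a.to("cpu")

# "Discard" (what I'd like as an option): drop the weights entirely and hand the VRAM back,
# accepting that they have to be re-read from disk the next time they're needed, which is
# cheap anyway because the file usually still sits in the OS page cache.
del enc_b
gc.collect()
torch.cuda.empty_cache()
```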
When I use the --highvram or --gpu-only switch I run out of memory on the device.
When I use the --disable-smart-memory switch it unloads models immediately after using them, but it still offloads them to RAM instead of discarding them.
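In case it helps to state the requested behaviour as code, here is a toy sketch of the loading policy I'm after (plain Python with hypothetical names, not ComfyUI's actual model management):

```python
import gc
import torch

class DiscardingModelCache:
    """Toy policy: only the model needed right now lives on the GPU; evicted models are
    dropped outright instead of being copied to system RAM (the OS page cache makes the
    reload from disk cheap)."""

    def __init__(self):
        self.loaded = {}  # name -> model currently on the GPU

    def get(self, name, load_fn):
        # Evict everything else first, freeing VRAM without keeping CPU copies around.
        for other in list(self.loaded):
            if other != name:
                self.discard(other)
        if name not in self.loaded:
            self.loaded[name] = load_fn().cuda()  # load_fn reads the weights from disk
        return self.loaded[name]

    def discard(self, name):
        del self.loaded[name]
        gc.collect()
        torch.cuda.empty_cache()
```

With something like this, the text encoders would be discarded the moment Flux is requested, instead of piling up in RAM next to the page cache.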
Thanks in advance,
Peter