
Conversation

@mefich (Contributor) commented Oct 30, 2025

This is a small PR that ensures the vision model of a VLM is unloaded and doesn't stay in VRAM indefinitely.

I've used a few exl3 VLMs and noticed that after unloading them, a noticeable amount of VRAM stayed reserved by TabbyAPI.
The exl3 backend's unload function was missing the code to unload the vision part.

This change ensures that when a VLM is unloaded, its vision component is unloaded as well.
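
For context, the shape of the fix looks roughly like the sketch below. This is a minimal illustration only: the class name `ExllamaV3Container`, the attribute names `model`/`vision_model`, and the `.unload()` calls are assumptions about TabbyAPI's internals, not the actual diff.

```python
import gc

import torch


class ExllamaV3Container:
    """Hypothetical stand-in for TabbyAPI's exl3 model container."""

    model = None         # text model handle (assumed attribute name)
    vision_model = None  # vision tower handle (assumed attribute name)

    async def unload(self):
        # Release the text model, as the backend already did.
        if self.model is not None:
            self.model.unload()
            self.model = None

        # The fix: also release the vision component so its weights
        # don't stay resident in VRAM after the VLM is unloaded.
        if self.vision_model is not None:
            self.vision_model.unload()
            self.vision_model = None

        # Drop dangling references and return cached blocks to the driver.
        gc.collect()
        torch.cuda.empty_cache()
```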
