Replies: 2 comments
-
Ah, I searched for 1.58-bit on Hugging Face... there is a quantization method related to it. But after comparing many options, I selected a smaller model instead.
-
Is there a reason to use q4_0 specifically? It works, but it is an older quantization format.
-
After applying q4_0 quantization, I noticed that the quality of my generated results declined. I appreciate the speed and reduced VRAM usage that the quantized model offers, but I am looking for ways to recover output quality. Could you suggest any solutions or improvements, such as calibration, LoRA, fine-tuning, or other techniques? Thank you for your assistance!
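To illustrate where the quality loss comes from, here is a minimal sketch of block-wise symmetric 4-bit quantization, similar in spirit to q4_0 (one scale per block of weights, integer values clipped to a 4-bit range). This is a toy illustration with made-up weights, not llama.cpp's actual kernel; the round-trip error it prints is the kind of per-weight noise that calibration (e.g. an importance matrix) or post-quantization fine-tuning tries to compensate for.

```python
import numpy as np

def quantize_4bit_symmetric(block):
    """Round-trip a block of weights through symmetric 4-bit quantization.

    One scale per block; quantized values are clipped to [-8, 7],
    loosely mirroring how q4_0 stores a per-block scale plus 4-bit ints.
    """
    max_abs = np.abs(block).max()
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(block / scale), -8, 7)
    return q * scale  # dequantized weights, with rounding error baked in

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=(4, 32))   # toy weight blocks of 32
deq = np.stack([quantize_4bit_symmetric(row) for row in weights])
err = np.abs(weights - deq).mean()
print(f"mean abs round-trip error: {err:.6f}")
```

The error scales with the per-block quantization step, which is why finer-grained formats (more blocks, more bits, or importance-weighted scales) typically degrade output quality less than plain q4_0.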