The PPL results I obtained are significantly higher than expected. Is there any known issue with Quarot + GPTQ on LLaMA 3 models, or could I be missing some optimization steps?
Any insights or suggestions would be greatly appreciated!
You should enable online_rotate. If there is still a gap, the fastest way to reproduce the results is to use the version from before August, since we have added many new features since then, which may have some impact.
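For reference, here is a minimal sketch of where that switch would typically sit in a Quarot YAML config for this toolkit. Only online_rotate itself comes from the reply above; the surrounding section layout and the other keys (rotate_mode, fp32_had, the bit/granularity fields) are assumptions based on typical example configs, so please verify them against the configs shipped with the release you are using.

```yaml
# Sketch only: everything except online_rotate is an assumed/typical layout.
# Check it against the example configs shipped with the repository.
quant:
    method: Quarot
    weight:
        bit: 4
        granularity: per_channel
    act:
        bit: 4
        granularity: per_token
    special:
        rotate_mode: hadamard   # Hadamard rotations as in the QuaRot paper
        fp32_had: True          # run the online Hadamard transform in fp32
        online_rotate: True     # the switch referred to in the reply above
```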
I tested the Quarot + GPTQ method with W4A4 quantization.
For LLaMA 2-7B:
Quarot only: PPL = 48
Quarot + GPTQ: PPL = 9.8
However, in Table 12, the reported PPL for Quarot + GPTQ (W4A4) is 6.22.
For LLaMA 3.1-8B:
Quarot only: PPL = 139
Quarot + GPTQ: PPL = 27.4
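Before attributing the whole gap to the algorithm, it may also be worth confirming that the evaluation setup matches the paper's, since WikiText-2 perplexity shifts noticeably with the evaluation sequence length and the numbers are only comparable at the same setting. The block below is only a sketch of an eval section; its key names are assumptions to be checked against the repo's example configs.

```yaml
# Sketch only: key names are assumptions, verify against the example configs.
eval:
    eval_pos: [pretrain, fake_quant]  # report both fp16 and fake-quantized PPL
    name: wikitext2                   # dataset behind the PPL numbers above
    seq_len: 2048                     # PPL is sensitive to this value
    bs: 1
```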
Below is the config file: