I was computing some GGUF conversions of Gemma 2 2B (llama.cpp release b3496) with the following commands, using wikitext-2-raw/wiki.test.raw as the dataset:
```bash
git clone https://huggingface.co/google/gemma-2-2b
python ./llama.cpp/convert_hf_to_gguf.py gemma-2-2b --outtype f32 --outfile gemma-2-2b.FP32.gguf
python ./llama.cpp/convert_hf_to_gguf.py gemma-2-2b --outtype q8_0 --outfile gemma-2-2b-Q8_0.gguf
cd llama.cpp
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q6_K.gguf Q6_K
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q5_K_M.gguf Q5_K_M
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q5_K_S.gguf Q5_K_S
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q4_K_M.gguf Q4_K_M
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q4_K_S.gguf Q4_K_S
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q3_K_L.gguf Q3_K_L
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q3_K_M.gguf Q3_K_M
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q3_K_S.gguf Q3_K_S
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q2_K.gguf Q2_K
./llama-perplexity -m ../gemma-2-2b.FP32.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q8_0.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q6_K.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q5_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q5_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q4_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q4_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q3_K_L.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q3_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q3_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q2_K.gguf -f ../wikitext-2-raw/wiki.test.raw
```
And the results that I am obtaining for the FP32 version are very strange... they are the worst of all. For the other versions, the numbers look OK: I see, as expected, a strong reduction in perplexity as the quantization precision increases.
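For anyone sanity-checking the numbers: as far as I know, the reported PPL is the exponential of the mean negative log-likelihood over the evaluated tokens, so lower is better, and the FP32 model would normally be expected to set the lower bound. A minimal Python sketch of that relationship (a generic illustration, not llama.cpp's actual implementation):

```python
# Generic illustration: perplexity is the exponential of the mean
# negative log-likelihood over the evaluated tokens (lower is better).
import math

def perplexity(token_logprobs):
    # token_logprobs: natural-log probabilities assigned to each evaluated token
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical example: three tokens with log-probs -1.2, -0.8, -2.0
print(perplexity([-1.2, -0.8, -2.0]))  # ~3.79
```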