Commit

Update files
leikareipa committed Dec 30, 2024
1 parent bbcba3d commit fd3cb5c
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion blog/comparing-quants-of-qwq-preview-in-ollama/content.md
@@ -164,6 +164,6 @@ For reference, the table includes the unquantized version of QwQ as hosted on Hu

These tests roughly confirm the previously-found bell curve, where Q4_K_M is the apex quant and categorically better than Q8_0. More broadly, the sweet spot appears to be between Q3_K_M and Q5_K_M, inclusive.

-Q8_0 was in fact so bad that it indicates a potential problem with Ollama. I don't see many other explanations for it being categorically worse than Q4_K_M <i>and</i> the unquantized version.
+Q8_0 was in fact so bad that it indicates a potential problem with Ollama. Outside of these tests, I've also seen preliminary indications that Ollama's FP16 struggles in a similar way. I opened an issue for this, but it was closed without resolution, so it's not something the Ollama authors are concerned about.

All of that said, my time with QwQ has also shown a fair bit of variance in output quality within any given quant. You'd ideally do very many runs of a test to find a realistic average.
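
As a rough illustration of that kind of repeated testing, here's a minimal sketch that queries a local Ollama server with the same prompt several times per quant and saves the raw outputs for later scoring. The model tags, prompt, and run count are placeholders rather than the exact ones used in these tests.

```python
# Sketch: run the same prompt several times per quant and store the outputs.
# Assumes a local Ollama server on the default port (11434); the model tags
# below are illustrative and may not match the tags used in the actual tests.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
QUANTS = ["qwq:32b-preview-q4_K_M", "qwq:32b-preview-q8_0"]  # hypothetical tags
PROMPT = "Write a JavaScript function that reverses a linked list."  # placeholder prompt
RUNS_PER_QUANT = 10

def generate(model: str, prompt: str) -> str:
    # Single non-streaming completion via Ollama's /api/generate endpoint.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

# Collect all runs per quant so they can be scored and averaged afterwards.
results = {quant: [generate(quant, PROMPT) for _ in range(RUNS_PER_QUANT)] for quant in QUANTS}

with open("quant-outputs.json", "w") as f:
    json.dump(results, f, indent=2)
```

With the outputs saved per quant, you can score them however you prefer and average across the runs to smooth out the per-run variance.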
2 changes: 1 addition & 1 deletion blog/comparing-quants-of-qwq-preview-in-ollama/index.html
@@ -238,7 +238,7 @@
<p>For reference, the table includes the unquantized version of QwQ as hosted on Hugging Face Playground.</p>
</dokki-topic><dokki-topic title="Conclusions">
<p>These tests roughly confirm the previously-found bell curve, where Q4_K_M is the apex quant and categorically better than Q8_0. More broadly, the sweet spot appears to be between Q3_K_M and Q5_K_M, inclusive.</p>
-<p>Q8_0 was in fact so bad that it indicates a potential problem with Ollama. I don't see many other explanations for it being categorically worse than Q4_K_M <i>and</i> the unquantized version.</p>
+<p>Q8_0 was in fact so bad that it indicates a potential problem with Ollama. Outside of these tests, I've also seen preliminary indications that Ollama's FP16 struggles in a similar way. I opened an issue for this, but it was closed without resolution, so it's not something the Ollama authors are concerned about.</p>
<p>All of that said, my time with QwQ has also shown a fair bit of variance in output quality within any given quant. You'd ideally do very many runs of a test to find a realistic average.</p>
</dokki-topic>

