gguf: parser for GGUF filename variants (LoRA / vocab / MTP / imatrix) #2171
Open
mishig25 wants to merge 1 commit into
Adds parseGGUFFileVariant for filename-level GGUF variants:
- LoRA, vocab: spec <Type> slot values (GGUF naming spec)
- MTP: spec <Type> slot value (ggml-org/ggml#1488, in-flight)
- imatrix: community marker for importance-matrix calibration; not part of the spec <Type> slot but appears widely

Returns a GGUFFileVariant[] in first-occurrence order with exact duplicates deduped. Plain model files return [].

Behaviour:
- delimited token match: token must be bounded by ^, -, or . on each side (so substrings inside other tokens, such as "imatrixed" or "MTPiston", don't false-match);
- case-insensitive match, canonical case returned;
- works for tokens in any position (prefix, middle, suffix) since real-world filenames put MTP / imatrix on either side of the encoding;
- `i1-` prefix is intentionally NOT recognized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
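The behaviour listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the function and type names come from the description, but the body is hypothetical. It splits the basename on `-` and `.` instead of using a regex, and treats `/` as the only path separator.

```typescript
// Hypothetical sketch; the PR's actual implementation is not shown on this page.
type GGUFFileVariant = "LoRA" | "vocab" | "MTP" | "imatrix";

// Lowercase token -> canonical spelling, for case-insensitive lookup.
const CANONICAL_VARIANTS = new Map<string, GGUFFileVariant>([
  ["lora", "LoRA"],
  ["vocab", "vocab"],
  ["mtp", "MTP"],
  ["imatrix", "imatrix"],
]);

function parseGGUFFileVariant(filename: string): GGUFFileVariant[] {
  // Keep only the basename so directory names cannot contribute tokens
  // (assumption: "/" is the path separator).
  const base = filename.split("/").pop() ?? "";

  const found: GGUFFileVariant[] = [];
  // Splitting on "-" and "." means a token matches only when it is bounded
  // by a delimiter or a string edge on both sides, so "imatrixed" and
  // "MTPiston" are compared as whole tokens and rejected.
  for (const token of base.split(/[-.]/)) {
    const variant = CANONICAL_VARIANTS.get(token.toLowerCase());
    // Preserve first-occurrence order and skip exact duplicates.
    if (variant !== undefined && !found.includes(variant)) {
      found.push(variant);
    }
  }
  return found;
}
```

Because `i1` is not a key in the lookup table, names such as `DeepSeek-V3.i1-Q4_K_M.gguf` yield an empty array in this sketch, which matches the stated intent of not recognizing the `i1-` prefix.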
julien-c approved these changes on May 13, 2026
Comment on lines +358 to +359
expect(parseGGUFFileVariant("Model-IQ2_XXS-imatrix.gguf")).toEqual(["imatrix"]);
expect(parseGGUFFileVariant("Model-imatrix-Q4_K_M.gguf")).toEqual(["imatrix"]); // prefix position
Member
are you sure we want to support both orders? the spec defines one canonical order no?
Adds `parseGGUFFileVariant` for filename-level GGUF variants. Today nothing in the SDK parses the `<Type>` slot at all: there are parsers for `<Encoding>` (`parseGGUFQuantLabel`) and `<Shard>` (`parseGgufShardFilename`) but not for the rest.

API

Returns variants in first-occurrence order, with exact duplicates deduped. `[]` for plain model files.

| Filename | Result |
| --- | --- |
| `Model-Q4_K_M.gguf` | `[]` |
| `Model-Q4_K_M-LoRA.gguf` | `["LoRA"]` |
| `Qwen3.6-27B-MTP-Q8_0.gguf` | `["MTP"]` |
| `Model-IQ2_XXS-imatrix.gguf` | `["imatrix"]` |
| `Model-MTP-imatrix-Q4_K_M.gguf` | `["MTP", "imatrix"]` |
| `Model-imatrix-Q4_K_M-imatrix.gguf` | `["imatrix"]` (dedup) |
| `DeepSeek-V3.i1-Q4_K_M.gguf` | `[]` (`i1-` not recognized) |
| `Llama-imatrixed-Q4_K_M.gguf` | `[]` (substring inside another token) |

Design choices

- A filename can carry both a spec `<Type>` token and an `imatrix` marker (e.g., `Model-MTP-imatrix-Q4_K_M.gguf`). An array preserves that without collapsing axes.
- `imatrix` is included alongside spec values even though it's not part of the spec `<Type>` slot. It's a widespread community marker (e.g., antirez/deepseek-v4-gguf uses an `-imatrix` suffix) and consumers usually want both surfaced from one call.
- `i1-` prefix NOT recognized. mradermacher's `i1-` is the only filename heuristic for imatrix that uses a different shape; IQ-quants don't actually imply imatrix and Q-quants can use it too, so we deliberately don't expand the matcher to cover heuristic markers, only explicit `imatrix` tokens.
- Delimited token match (`^`, `-`, `.` on each side, case-insensitive) so substrings like `imatrixed` or `MTPiston` don't false-match.
- `MTP` is anticipated by ggml-org/ggml#1488, which is still in flight. `LoRA` and `vocab` are long-established in the spec.

Tests

Covers: empty array for plain models, all four variant values, multi-token preservation, dedup, case-insensitive match with canonical-case return, `.` separator, path prefix stripping, explicit non-match for `i1-`, and substring false-positive guards (`imatrixed`, `MTPiston`).

Note

Medium Risk
Medium risk because it introduces new filename parsing logic (regex with lookbehind and global matching) and re-exports it as part of the public `gguf` API, which could have runtime compatibility or edge-case parsing impacts.

Overview
Adds `parseGGUFFileVariant` (and `GGUFFileVariant`) to extract filename-level GGUF variant tokens like `LoRA`, `vocab`, `MTP`, and `imatrix`, returning a deduped list in first-occurrence order with delimiter-bounded, case-insensitive matching.

Re-exports the new parser/type via `packages/gguf/src/gguf.ts`, and expands `gguf.spec.ts` with coverage for ordering, deduping, delimiter handling (`-`/`.`), path stripping, and false-positive guards.

Reviewed by Cursor Bugbot for commit 47ad4c3. Bugbot is set up for automated code reviews on this repo.