gguf: parser for GGUF filename variants (LoRA / vocab / MTP / imatrix) #2171

Open
mishig25 wants to merge 1 commit into huggingface:main from mishig25:mishig/gguf-file-variant
mishig25 (Collaborator) commented May 13, 2026

Adds parseGGUFFileVariant for filename-level GGUF variants. Today nothing in the SDK parses the <Type> slot at all — there are parsers for <Encoding> (parseGGUFQuantLabel) and <Shard> (parseGgufShardFilename) but not for the rest.

API

export type GGUFFileVariant = "LoRA" | "vocab" | "MTP" | "imatrix";
export function parseGGUFFileVariant(fname: string): GGUFFileVariant[];

Returns variants in first-occurrence order, with exact duplicates removed. Plain model files return [].

Filename                             Returns
Model-Q4_K_M.gguf                    []
Model-Q4_K_M-LoRA.gguf               ["LoRA"]
Qwen3.6-27B-MTP-Q8_0.gguf            ["MTP"]
Model-IQ2_XXS-imatrix.gguf           ["imatrix"]
Model-MTP-imatrix-Q4_K_M.gguf        ["MTP", "imatrix"]
Model-imatrix-Q4_K_M-imatrix.gguf    ["imatrix"] (dedup)
DeepSeek-V3.i1-Q4_K_M.gguf           [] (i1- not recognized)
Llama-imatrixed-Q4_K_M.gguf          [] (substring inside another token)
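
One way to read the table above is as a token lookup: strip any directory prefix, split the basename on - and ., and keep the tokens that match a known variant. The sketch below follows that reading. It is not the code in this PR; the name variantsBySplit is hypothetical, and treating / as the only path separator is an assumption.

```typescript
type GGUFFileVariant = "LoRA" | "vocab" | "MTP" | "imatrix";

// Canonical spellings; matching is done case-insensitively against these.
const KNOWN_VARIANTS: GGUFFileVariant[] = ["LoRA", "vocab", "MTP", "imatrix"];

// Hypothetical helper, not the PR's implementation.
function variantsBySplit(fname: string): GGUFFileVariant[] {
  // Assumption: "/" is the only path separator we need to strip.
  const base = fname.split("/").pop() ?? fname;
  const out: GGUFFileVariant[] = [];
  // Splitting on "-" and "." yields whole tokens, so "imatrixed" never
  // equals "imatrix" and cannot false-match.
  for (const token of base.split(/[-.]/)) {
    const lower = token.toLowerCase();
    for (const variant of KNOWN_VARIANTS) {
      // Keep first-occurrence order and skip exact duplicates.
      if (variant.toLowerCase() === lower && out.indexOf(variant) === -1) {
        out.push(variant);
      }
    }
  }
  return out;
}

console.log(variantsBySplit("Model-MTP-imatrix-Q4_K_M.gguf")); // ["MTP", "imatrix"]
console.log(variantsBySplit("Llama-imatrixed-Q4_K_M.gguf")); // []
```

Under these assumptions, every row of the table produces the listed result, including the i1- and imatrixed non-matches.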

Design choices

  • Array, not single value. A file can legitimately carry both a spec <Type> token and an imatrix marker (e.g., Model-MTP-imatrix-Q4_K_M.gguf). An array preserves that without collapsing axes.
  • imatrix included alongside spec values even though it's not part of the spec <Type> slot. It's a widespread community marker (e.g., antirez/deepseek-v4-gguf uses -imatrix suffix) and consumers usually want both surfaced from one call.
  • i1- prefix NOT recognized. mradermacher's i1- is the only filename heuristic for imatrix that uses a different shape; IQ-quants don't actually imply imatrix and Q-quants can use it too, so we deliberately don't expand the matcher to cover heuristic markers — only explicit imatrix tokens.
  • Delimited token match (^, -, . on each side, case-insensitive) so substrings like imatrixed or MTPiston don't false-match.
  • MTP is anticipated by ggml-org/ggml#1488, which is still in flight. LoRA and vocab are long-established in the spec.
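
The delimited, case-insensitive match described in these bullets can be sketched with a lookbehind/lookahead regex. This is an illustration under stated assumptions, not the PR's actual code: the name variantsByRegex, the exact pattern, the use of end-of-string as a valid right boundary, and the "/"-only path stripping are all assumptions.

```typescript
type GGUFFileVariant = "LoRA" | "vocab" | "MTP" | "imatrix";

// Lower-cased token -> canonical spelling returned to callers.
const CANONICAL: Record<string, GGUFFileVariant> = {
  lora: "LoRA",
  vocab: "vocab",
  mtp: "MTP",
  imatrix: "imatrix",
};

// Hypothetical helper, not the PR's implementation.
function variantsByRegex(fname: string): GGUFFileVariant[] {
  // Assumption: "/" is the only path separator we need to strip.
  const base = fname.split("/").pop() ?? fname;
  // Left boundary: start of string, "-" or ".". Right boundary: "-", "."
  // or end of string (assumed). Both are zero-width, so adjacent tokens
  // such as "-MTP-imatrix-" can share a delimiter.
  const re = new RegExp("(?<=^|[-.])(lora|vocab|mtp|imatrix)(?=[-.]|$)", "gi");
  const out: GGUFFileVariant[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(base)) !== null) {
    const variant = CANONICAL[m[1].toLowerCase()];
    // Keep first-occurrence order and skip exact duplicates.
    if (out.indexOf(variant) === -1) {
      out.push(variant);
    }
  }
  return out;
}

console.log(variantsByRegex("model-lora.gguf")); // ["LoRA"]
console.log(variantsByRegex("MTPiston-Q4_K_M.gguf")); // []
```

Because the boundaries are assertions rather than consumed characters, the pattern needs regex lookbehind support in the target runtime, which is the compatibility concern a regex of this shape would raise.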

Tests

Covers: empty array for plain models, all four variant values, multi-token preservation, dedup, case-insensitive match with canonical-case return, . separator, path prefix stripping, explicit non-match for i1-, and substring false-positive guards (imatrixed, MTPiston).


Note

Medium Risk
Medium risk because it introduces new filename parsing logic (regex with lookbehind and global matching) and re-exports it as part of the public gguf API, which could have runtime compatibility or edge-case parsing impacts.

Overview
Adds parseGGUFFileVariant (and GGUFFileVariant) to extract filename-level GGUF variant tokens like LoRA, vocab, MTP, and imatrix, returning a deduped list in first-occurrence order with delimiter-bounded, case-insensitive matching.

Re-exports the new parser/type via packages/gguf/src/gguf.ts, and expands gguf.spec.ts with coverage for ordering, deduping, delimiter handling (-/.), path stripping, and false-positive guards.

Reviewed by Cursor Bugbot for commit 47ad4c3. Bugbot is set up for automated code reviews on this repo.

Adds parseGGUFFileVariant for filename-level GGUF variants:

- LoRA, vocab — spec <Type> slot values (GGUF naming spec)
- MTP        — spec <Type> slot value (ggml-org/ggml#1488, in-flight)
- imatrix    — community marker for importance-matrix calibration; not
               part of the spec <Type> slot but appears widely

Returns a GGUFFileVariant[] in first-occurrence order with exact
duplicates deduped. Plain model files return [].

Behaviour:
- delimited token match: token must be bounded by ^, -, or . on each
  side (so substrings inside other tokens — "imatrixed", "MTPiston" —
  don't false-match);
- case-insensitive match, canonical case returned;
- works for tokens in any position (prefix, middle, suffix) since real-
  world filenames put MTP / imatrix on either side of the encoding;
- `i1-` prefix is intentionally NOT recognized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mishig25 mishig25 marked this pull request as ready for review May 13, 2026 09:27

julien-c (Member) left a comment

OK for me, but let's wait for @ngxson's review before merging.

Comment on lines +358 to +359
expect(parseGGUFFileVariant("Model-IQ2_XXS-imatrix.gguf")).toEqual(["imatrix"]);
expect(parseGGUFFileVariant("Model-imatrix-Q4_K_M.gguf")).toEqual(["imatrix"]); // prefix position

Are you sure we want to support both orders? The spec defines one canonical order, no?
