Commit 3ce3599

docs : add MTP to GGUF Type slot
Adds `MTP` as a third value for the `Type` slot (alongside `LoRA` and `vocab`) to cover Multi-Token Prediction / speculative-decoding draft modules shipped beside a base model. Updates the validation regex in both the prose and JS copies, adds a filename example, and extends the Node.js test cases.
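
A minimal sketch of why `MTP` has to appear in both the `Encoding` negative lookahead and the `Type` alternation. The regex is copied from the updated `docs/gguf.md`; `slotsOf` is a hypothetical helper used only for illustration and is not part of the docs, and the encoding-less filename is an assumed example rather than a real release.

```javascript
// Regex copied verbatim from the updated docs/gguf.md in this commit.
const ggufRegex = /^(?<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))-(?:(?<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)(?:-(?<FineTune>[A-Za-z0-9\s-]+))?)?-(?:(?<Version>v\d+(?:\.\d+)*))(?:-(?<Encoding>(?!LoRA|vocab|MTP)[\w_]+))?(?:-(?<Type>LoRA|vocab|MTP))?(?:-(?<Shard>\d{5}-of-\d{5}))?\.gguf$/;

// Hypothetical helper: returns the Encoding and Type slots of a filename,
// or null when the filename does not follow the naming convention.
function slotsOf(filename) {
  const match = ggufRegex.exec(filename);
  if (!match) return null;
  // Unmatched named groups are undefined; normalise them to null.
  const { Encoding = null, Type = null } = match.groups;
  return { Encoding, Type };
}

// Filename taken from the new example in the docs.
console.log(slotsOf('Qwen3-27B-v1.0-Q4_K_M-MTP.gguf'));
// Assumed filename with no Encoding slot: the lookahead stops `MTP`
// from being captured as an Encoding, so it is captured as the Type.
console.log(slotsOf('Qwen3-27B-v1.0-MTP.gguf'));
```

Without `MTP` in the lookahead, the second filename would still validate, but `MTP` would be captured as `Encoding` and `Type` would stay empty.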
1 parent a056a26 commit 3ce3599

1 file changed

Lines changed: 12 additions & 2 deletions

docs/gguf.md

@@ -44,6 +44,7 @@ The components are:
 - If missing, then file is by default a typical gguf tensor model file
 - `LoRA` : GGUF file is a LoRA adapter
 - `vocab` : GGUF file with only vocab data and metadata
+- `MTP` : GGUF file contains Multi-Token Prediction heads (a speculative-decoding draft module), intended to be loaded alongside a base model of matching architecture and version
 1. **Shard**: (Optional) Indicates and denotes that the model has been split into multiple shards, formatted as `<ShardNum>-of-<ShardTotal>`.
 - *ShardNum* : Shard position in this model. Must be 5 digits padded by zeros.
 - Shard number always starts from `00001` onwards (e.g. First shard always starts at `00001-of-XXXXX` rather than `00000-of-XXXXX`).
@@ -54,7 +55,7 @@ The components are:

 At a minimum all model files should have at least BaseName, SizeLabel, Version, in order to be easily validated as a file that is keeping with the GGUF Naming Convention. An example of this issue is that it is easy for Encoding to be mistaken as a FineTune if Version is omitted.

-To validate you can use this regular expression `^(?<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))-(?:(?<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)(?:-(?<FineTune>[A-Za-z0-9\s-]+))?)?-(?:(?<Version>v\d+(?:\.\d+)*))(?:-(?<Encoding>(?!LoRA|vocab)[\w_]+))?(?:-(?<Type>LoRA|vocab))?(?:-(?<Shard>\d{5}-of-\d{5}))?\.gguf$` which will check that you got the minimum BaseName, SizeLabel and Version present in the correct order.
+To validate you can use this regular expression `^(?<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))-(?:(?<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)(?:-(?<FineTune>[A-Za-z0-9\s-]+))?)?-(?:(?<Version>v\d+(?:\.\d+)*))(?:-(?<Encoding>(?!LoRA|vocab|MTP)[\w_]+))?(?:-(?<Type>LoRA|vocab|MTP))?(?:-(?<Shard>\d{5}-of-\d{5}))?\.gguf$` which will check that you got the minimum BaseName, SizeLabel and Version present in the correct order.

 For example:

@@ -81,12 +82,20 @@ For example:
 - Weight Encoding Scheme: Q4_0
 - Shard: 3 out of 9 total shards

+* `Qwen3-27B-v1.0-Q4_K_M-MTP.gguf`
+- Model Name: Qwen3
+- Expert Count: 0
+- Parameter Count: 27B
+- Version Number: v1.0
+- Weight Encoding Scheme: Q4_K_M
+- Type: MTP (Multi-Token Prediction draft module)
+

 <details><summary>Example Node.js Regex Function</summary>

 ```js
 #!/usr/bin/env node
-const ggufRegex = /^(?<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))-(?:(?<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)(?:-(?<FineTune>[A-Za-z0-9\s-]+))?)?-(?:(?<Version>v\d+(?:\.\d+)*))(?:-(?<Encoding>(?!LoRA|vocab)[\w_]+))?(?:-(?<Type>LoRA|vocab))?(?:-(?<Shard>\d{5}-of-\d{5}))?\.gguf$/;
+const ggufRegex = /^(?<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))-(?:(?<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)(?:-(?<FineTune>[A-Za-z0-9\s-]+))?)?-(?:(?<Version>v\d+(?:\.\d+)*))(?:-(?<Encoding>(?!LoRA|vocab|MTP)[\w_]+))?(?:-(?<Type>LoRA|vocab|MTP))?(?:-(?<Shard>\d{5}-of-\d{5}))?\.gguf$/;

 function parseGGUFFilename(filename) {
 const match = ggufRegex.exec(filename);
@@ -101,6 +110,7 @@ const testCases = [
 {filename: 'Grok-100B-v1.0-Q4_0-00003-of-00009.gguf', expected: { BaseName: 'Grok', SizeLabel: '100B', FineTune: null, Version: 'v1.0', Encoding: 'Q4_0', Type: null, Shard: "00003-of-00009"}},
 {filename: 'Hermes-2-Pro-Llama-3-8B-v1.0-F16.gguf', expected: { BaseName: 'Hermes-2-Pro-Llama-3', SizeLabel: '8B', FineTune: null, Version: 'v1.0', Encoding: 'F16', Type: null, Shard: null}},
 {filename: 'Phi-3-mini-3.8B-ContextLength4k-instruct-v1.0.gguf', expected: { BaseName: 'Phi-3-mini', SizeLabel: '3.8B-ContextLength4k', FineTune: 'instruct', Version: 'v1.0', Encoding: null, Type: null, Shard: null}},
+{filename: 'Qwen3-27B-v1.0-Q4_K_M-MTP.gguf', expected: { BaseName: 'Qwen3', SizeLabel: '27B', FineTune: null, Version: 'v1.0', Encoding: 'Q4_K_M', Type: 'MTP', Shard: null}},
 {filename: 'not-a-known-arrangement.gguf', expected: null},
 ];
