Replies: 3 comments
-
I totally agree that adding a tensor_data_offset to the GGUF format would make life a lot easier for custom readers! The current options require either parsing the entire metadata, which adds unnecessary complexity, or modifying GGUF files to include a special offset. If the GGUF format supported this offset as part of the specification, it would make things much cleaner and more efficient. We could add an optional tensor_data_offset field in the GGUF header to allow readers to quickly access the tensor data, skipping over unnecessary metadata. This would also keep the format backward-compatible because older readers could simply ignore this field.
Code Example for Reader:
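A minimal sketch of what such a reader could look like. It assumes a purely hypothetical layout in which a little-endian uint64 tensor_data_offset (an absolute byte offset) directly follows the existing fixed fields magic, version, tensor_count and metadata_kv_count. No released GGUF version defines this field, and the function names are illustrative only.

```python
import io
import struct

# Existing fixed GGUF fields (little-endian): magic, version, tensor_count,
# metadata_kv_count. The trailing uint64 is the HYPOTHETICAL
# tensor_data_offset discussed above; it is not part of the GGUF spec.
HEADER = struct.Struct("<4sIQQQ")

def read_tensor_data_offset(f):
    """Return the hypothetical tensor_data_offset stored in the fixed header."""
    raw = f.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("file too short for the assumed header layout")
    magic, version, n_tensors, n_kv, offset = HEADER.unpack(raw)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return offset

def seek_to_tensor_data(f):
    """Position f at the start of tensor_data without parsing any metadata."""
    f.seek(0)
    offset = read_tensor_data_offset(f)
    f.seek(offset)
    return offset
```

With a layout like this, a reader needs one fixed-size read and one seek, regardless of how many metadata entries the file contains.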
-
This seems like a very narrow use case: generally one does not know the order and the dimensions of the tensor data.
-
In the above use case, I first read the 10 parameters of the GPT-2 model here: https://github.com/certik/fastGPT/blob/7d96ec2e23b2a1a07aea625f72661d3f650c7ee5/driver.f90#L106, then use those dimensions to properly read the rest of the arrays. So one doesn't need to know the dimensions in advance, but one does need to know the order. The big advantage is that the file is easy to read from any computational code in any language.
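The pattern described here (read the dimensions first, then use them to size every following array) can be sketched as below. This is an illustration, not a translation of the linked Fortran reader: it assumes the dimensions are stored as int32 values at the start of the data section, and it uses only two hypothetical parameters (n_vocab, n_embd) and two hypothetical arrays (wte, ln_b) instead of the 10 GPT-2 parameters.

```python
import io
import struct
import sys
from array import array

def read_i32(f, n):
    """Read n little-endian 32-bit integers."""
    return struct.unpack(f"<{n}i", f.read(4 * n))

def read_f32(f, n):
    """Read n little-endian 32-bit floats into a flat array."""
    a = array("f")
    a.frombytes(f.read(4 * n))
    if sys.byteorder == "big":
        a.byteswap()  # the data is assumed little-endian on disk
    return a

def read_model(f, data_offset):
    """Read dimensions first, then the arrays in a fixed, known order."""
    f.seek(data_offset)
    # Hypothetical parameters; the reader only has to know their order.
    n_vocab, n_embd = read_i32(f, 2)
    # Array sizes follow from the parameters that were just read.
    wte = read_f32(f, n_vocab * n_embd)
    ln_b = read_f32(f, n_embd)
    return {"n_vocab": n_vocab, "n_embd": n_embd, "wte": wte, "ln_b": ln_b}
```

The reader never looks at tensor names or metadata; the fixed order is the whole contract between writer and reader, which is what keeps the same logic short in Fortran, C, or any other language.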
-
When writing custom readers for GGUF, it would be very helpful to be able to skip the header and go directly to tensor_data to load all arrays (if I know their order and dimensions). Currently I have two options:
1. Parse the whole header to find where tensor_data starts.
2. Store the offset to tensor_data as a first variable in the header (an example of this approach is here: https://github.com/certik/fastGPT/blob/7d96ec2e23b2a1a07aea625f72661d3f650c7ee5/driver.f90#L72).
I use the second approach, which works great, but it means I have to create special GGUF files that have this general.data_offset first variable. If the GGUF format had this offset as part of the file format, then any GGUF file could be easily parsed with this approach. Many other file formats have a similar feature that lets a reader skip sections it does not care about by specifying the section length. This would also make it possible to use existing readers with future GGUF versions that might add more metadata types (the reader would simply skip the rest of the header if it doesn't support a given type).
If implemented in the next GGUF version, this would make it easy to write a simple reader in custom computational code and reuse existing GGUF writers, thus making the GGUF format more versatile for almost any numerical computational work.
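A minimal sketch of the second approach against the current GGUF layout (version 2 or later, little-endian): read the fixed fields, then read only the first metadata key-value pair. It assumes that general.data_offset is stored with one of the integer value types and holds an absolute byte offset from the start of the file; both are assumptions made for illustration, not details confirmed by the linked reader.

```python
import io
import struct

# GGUF metadata value type ids for the integer types accepted here:
# 4 = uint32, 5 = int32, 10 = uint64, 11 = int64.
INT_FORMATS = {4: "<I", 5: "<i", 10: "<Q", 11: "<q"}

def read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def read_data_offset(f, key="general.data_offset"):
    """Return the offset stored in the first metadata entry of the file."""
    magic, version, n_tensors, n_kv = struct.unpack("<4sIQQ", f.read(24))
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    if version < 2:
        raise ValueError("version 1 used 32-bit counts; not handled here")
    if n_kv == 0 or read_string(f) != key:
        raise ValueError(f"{key} is not the first metadata entry")
    (value_type,) = struct.unpack("<I", f.read(4))
    fmt = INT_FORMATS.get(value_type)
    if fmt is None:
        raise ValueError("offset is not stored as an integer type")
    (offset,) = struct.unpack(fmt, f.read(struct.calcsize(fmt)))
    return offset
```

After `f.seek(read_data_offset(f))` the file is positioned at tensor_data, and the rest of the metadata and the tensor info table are never parsed. The cost is that the writer must emit this key first, which is why the approach only works on specially prepared files.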
Related proposal: #3975.