Replies: 2 comments
-
What would be the advantage of a 64-bit random UUID compared to a descriptive enum field?
-
It means that, at least for new prototype tensor schemes from organizations outside the ggml repo, they can test and release their own gguf files with their own structure without needing to coordinate with ggml first.

In the case of gguf and llama.cpp, you mentioned that you previously tried to keep the enum list in sync between gguf and llama.cpp, but it desynced, hence the separate enum lists. This would hopefully fix that particular issue by removing the need for sequential IDs.

Anyway, I suspect it's a bit too late to change this for the current gguf version, but I imagine we could do it for v4 if it makes sense to do so.

In general, if there is some ID or type that we would like to decouple from central coordination, we may want to encode a random ID (and perhaps an extra optional descriptive string in the KV store, so that we can survey the most common names for it and its likely meaning if the external orgs disappear or something).

Schema Hints?

Would it also be possible to add a schema hint to the KV store, so people can get an idea of the breakdown of the 'rest of the file' (whose implementation is usually found in llama.cpp)? Ideally it should be enough that you could run a code generator and generate a header like, say, the one below:

    /* WARN AUTOGENERATED BY GGML */
    typedef struct {
        ggml_half delta;
        uint8_t quants[QK4_0 / 2]; // Note: an array of 4-bit weights, stored as 'quants'
    } block_q4_0;

More so internally, what I'm hoping would be possible is to autogenerate a diagram of the gguf tensor schema breakdown of the file, to make it easier for developers to visualize what bit packing to expect. We are obviously not going to store everything, such as how to use it, since that's a matter for llama.cpp to explain. But being able to put a name to each variable would go a long way.

To keep the hints from becoming horribly large in the KV store, instead of storing them as a JSON file... maybe CBOR?
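To make the code-generator idea a little more concrete, here is a minimal sketch of how a per-field schema hint could be turned back into a C typedef. Everything in it is an assumption for illustration: `field_hint`, `emit_struct`, and `block_q4_0_hint` are hypothetical names, and nothing like this exists in the gguf spec or in ggml today. A real hint would be read from the KV store rather than hard-coded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h> /* strstr, handy when inspecting the generated text */

/* Hypothetical schema hint for one struct field; not part of the gguf spec. */
typedef struct {
    const char *type;      /* C type name, e.g. "ggml_half" */
    const char *name;      /* field name, e.g. "delta" */
    const char *array_len; /* array length expression, or NULL for a scalar */
} field_hint;

/* Hint matching the example header above, hard-coded for illustration. */
static const field_hint block_q4_0_hint[] = {
    { "ggml_half", "delta",  NULL        },
    { "uint8_t",   "quants", "QK4_0 / 2" },
};

/* Writes a C typedef for the hinted layout into buf.
   Returns 0 on success, -1 if buf is too small or formatting fails. */
static int emit_struct(char *buf, size_t cap, const char *struct_name,
                       const field_hint *fields, size_t n_fields) {
    assert(buf != NULL && cap > 0);
    int w = snprintf(buf, cap, "typedef struct {\n");
    if (w < 0 || (size_t)w >= cap) return -1;
    size_t used = (size_t)w;

    for (size_t i = 0; i < n_fields; i++) {
        const field_hint *f = &fields[i];
        if (f->array_len != NULL) {
            w = snprintf(buf + used, cap - used, "    %s %s[%s];\n",
                         f->type, f->name, f->array_len);
        } else {
            w = snprintf(buf + used, cap - used, "    %s %s;\n",
                         f->type, f->name);
        }
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }

    w = snprintf(buf + used, cap - used, "} %s;\n", struct_name);
    if (w < 0 || (size_t)w >= cap - used) return -1;
    return 0;
}
```

Calling `emit_struct(buf, sizeof buf, "block_q4_0", block_q4_0_hint, 2)` fills `buf` with a typedef containing the two fields. The same field list could just as well drive a diagram generator, which is the part I'd find most useful.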
-
This is more about the file format design.
Wondering why we can't have the GGUF tensor scheme ID be a randomly selected ID rather than an enum?
This would mean people can experiment with different schemes, and when one actually turns out to be useful, it can then be registered in the gguf spec. Seems more robust from a standards perspective.
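As a rough illustration of what I have in mind, a reader could treat the scheme ID as an opaque 64-bit value and look it up in a table instead of indexing a sequential enum. This is only a sketch under assumptions: `scheme_entry`, `known_schemes`, `scheme_lookup`, and `scheme_registry_self_check` are hypothetical names, and the ID constants are arbitrary values picked for the example, not anything registered anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry entry; none of these names exist in ggml or gguf. */
typedef struct {
    uint64_t    id;   /* randomly chosen 64-bit scheme ID */
    const char *name; /* optional descriptive label */
} scheme_entry;

/* Arbitrary example IDs, chosen for illustration only. */
static const scheme_entry known_schemes[] = {
    { 0x9f3c2a71d4e8b605ULL, "q4_0" },
    { 0x1b7e44c09a2d5f38ULL, "vendor_experimental" },
};

#define N_KNOWN_SCHEMES (sizeof(known_schemes) / sizeof(known_schemes[0]))

/* Returns the label for a scheme ID, or NULL if this reader does not know it.
   An unknown ID is not an error: the file may use a scheme that has not been
   registered in the spec yet. */
static const char *scheme_lookup(uint64_t id) {
    for (size_t i = 0; i < N_KNOWN_SCHEMES; i++) {
        if (known_schemes[i].id == id) return known_schemes[i].name;
    }
    return NULL;
}

/* Random IDs can collide in principle, so the table is checked for
   duplicates instead of relying on enum ordering. */
static void scheme_registry_self_check(void) {
    for (size_t i = 0; i < N_KNOWN_SCHEMES; i++) {
        for (size_t j = i + 1; j < N_KNOWN_SCHEMES; j++) {
            assert(known_schemes[i].id != known_schemes[j].id);
        }
    }
}
```

The point of the design is that an outside organization can pick its own ID and ship files right away. Readers that don't recognize the ID get `NULL` back and can skip or reject the tensor, and registering the scheme in the spec later is just a matter of adding one row.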