Format Decoder Conventions #746
Hey, there are no strict rules really, but I would advise decoding to a structure that resembles the "intent" of the format where possible, and in a way that is easy to query and work with. For example, if the format is a nested tree structure, it probably makes sense to decode it as such (see the mp4 or msgpack decoders for examples), but if the format is more a set of specialized lists of things, then just have a root struct with arrays, I think (see elf or macho for examples). Note that structs are automatically sorted by address but arrays are kept as they are. If unsure of the structure, I usually just get decoding working and then try it on some files to see how it feels and how some queries would work; if it's awkward, I restructure. Do you have any specific formats in mind that you would like to add?
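To make the ordering remark concrete, here is a toy model in plain Go. It is not fq's implementation and does not use fq's API; `Field`, `StructOrder` and `ArrayOrder` are hypothetical names made up for this sketch. It only shows what "sorted by address" versus "kept as they are" would look like for the same decoded fields.

```go
package main

import (
	"fmt"
	"sort"
)

// Field is a hypothetical stand-in for a decoded value.
// Start is the address where the value begins in the input.
type Field struct {
	Name  string
	Start int
}

// StructOrder models a struct in this toy: children are presented
// sorted by address, regardless of the order they were decoded in.
func StructOrder(fields []Field) []Field {
	out := append([]Field(nil), fields...) // copy, leave the input untouched
	sort.SliceStable(out, func(i, j int) bool { return out[i].Start < out[j].Start })
	return out
}

// ArrayOrder models an array in this toy: children keep decode order.
func ArrayOrder(fields []Field) []Field {
	return append([]Field(nil), fields...)
}

func main() {
	// Decode order differs from address order, as happens when a
	// decoder jumps around the file following offsets.
	decoded := []Field{{"body", 64}, {"header", 0}, {"index", 16}}
	fmt.Println("struct:", StructOrder(decoded))
	fmt.Println("array: ", ArrayOrder(decoded))
}
```

The practical consequence, under these assumptions, is that the order of an offset list survives only if it is decoded as an array.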
I'm currently working on decoders for several different proprietary and unfortunately undocumented formats.
I see, let me know how it goes, happy to help! You can also email me if you want to share some code more privately. BTW, hopefully at some point I will get time to finish the Kaitai support, which might help with some custom decoders, see #627. But I would probably still use Go for some formats, depending on how fancy the decoding needs to be.
After reading https://github.com/wader/fq/blob/master/doc/dev.md, I have some questions as to how to handle "non-linear" formats that make use of many absolute offsets.
For example:
- A format that contains sections of data and lists of offsets that may be nested. Should the sections be decoded completely flat to allow the user to decide what to do with the offsets, or should they be decoded into a tree, using SeekAbs to locate the right section?
- A format that consists of a single list of offsets and then other sections of data. Should the sections be decoded flat, or as children of the list?
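To make the two options easier to compare, here is a minimal sketch in plain Go using only the standard library. It does not use fq's decode API, and the format is made up for illustration: one count byte, that many little-endian uint32 absolute offsets, then length-prefixed sections. All names (`Section`, `Flat`, `Entry`, `DecodeFlat`, `DecodeTree`, `sample`) are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Section is one length-prefixed blob; Start is its absolute offset.
type Section struct {
	Start   int
	Payload []byte
}

// Flat keeps the offset table and the sections as sibling arrays.
type Flat struct {
	Offsets  []uint32
	Sections []Section
}

// Entry is a tree node: an offset with its target section as a child.
type Entry struct {
	Offset  uint32
	Section Section
}

// readOffsets reads the table and returns the position just after it.
func readOffsets(buf []byte) ([]uint32, int, error) {
	if len(buf) < 1 {
		return nil, 0, fmt.Errorf("empty input")
	}
	end := 1 + 4*int(buf[0])
	if len(buf) < end {
		return nil, 0, fmt.Errorf("truncated offset table")
	}
	offs := make([]uint32, int(buf[0]))
	for i := range offs {
		offs[i] = binary.LittleEndian.Uint32(buf[1+4*i:])
	}
	return offs, end, nil
}

// readSectionAt reads one section at an absolute position.
func readSectionAt(buf []byte, start int) (Section, error) {
	if start < 0 || start >= len(buf) {
		return Section{}, fmt.Errorf("offset %d out of range", start)
	}
	end := start + 1 + int(buf[start])
	if end > len(buf) {
		return Section{}, fmt.Errorf("section at %d is truncated", start)
	}
	return Section{Start: start, Payload: buf[start+1 : end]}, nil
}

// DecodeFlat reads sections linearly and leaves offsets unresolved.
func DecodeFlat(buf []byte) (Flat, error) {
	offs, pos, err := readOffsets(buf)
	if err != nil {
		return Flat{}, err
	}
	flat := Flat{Offsets: offs}
	for pos < len(buf) {
		s, err := readSectionAt(buf, pos)
		if err != nil {
			return Flat{}, err
		}
		flat.Sections = append(flat.Sections, s)
		pos += 1 + len(s.Payload)
	}
	return flat, nil
}

// DecodeTree follows each offset (an absolute seek) to its section.
func DecodeTree(buf []byte) ([]Entry, error) {
	offs, _, err := readOffsets(buf)
	if err != nil {
		return nil, err
	}
	entries := make([]Entry, 0, len(offs))
	for _, off := range offs {
		s, err := readSectionAt(buf, int(off))
		if err != nil {
			return nil, err
		}
		entries = append(entries, Entry{Offset: off, Section: s})
	}
	return entries, nil
}

// sample builds an input whose table order differs from file order.
func sample() []byte {
	return []byte{2, 13, 0, 0, 0, 9, 0, 0, 0, 3, 'a', 'b', 'c', 2, 'x', 'y'}
}

func main() {
	flat, _ := DecodeFlat(sample())
	tree, _ := DecodeTree(sample())
	fmt.Printf("flat: offsets=%v sections=%d\n", flat.Offsets, len(flat.Sections))
	for _, e := range tree {
		fmt.Printf("tree: offset=%d payload=%q\n", e.Offset, e.Section.Payload)
	}
}
```

In this sketch the flat shape preserves file order and leaves matching offsets to sections up to the query, while the tree shape preserves table order and resolves each offset during decoding. Which one queries more comfortably is the open question above.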