User guide documentation update #5
base: master
Add an example showing how to use hexadecimal notation when defining enumerations (enums).
Add an example to show how enumeration values can be referred to in switch-on/cases constructs by identifier instead of integer value.
This is a first attempt at describing how streams and substreams work in Kaitai Struct. This documentation is based on advice provided at: kaitai-io/kaitai_struct#145 (comment)
Reflects changes made in kaitai-io/kaitai_struct_compiler@ba114f2
Oh, bummer ;) I've just realized that I wrote a streams/substreams section as well on April 9th, and just forgot to push it publicly...
I'm so sorry... I'll try to think of a way to merge both these sections into one.
Two sets of documentation are better than zero! :) I'll also see about updating this branch to ensure it can be merged. Is the content of these proposed documentation updates accurate? Feedback is appreciated on any inaccuracies, as are suggestions for rewording.
thus an exception occurs. The fact that the root stream still has
1001 bytes available to be requested from the input file does not
matter, as the `body` substream never has the opportunity to request
any more than the first 1000 bytes of the input file.
This is actually not a pitfall but legitimate behavior, and it is well explained in the previous section.
The "pitfall" I was thinking about in this section is the following: when a new substream is created, all parse instances with positions act within that substream by default.
So, this one works as expected:
seq:
  - id: skipped
    size: 1000
  - id: indexing
    type: file_index_entry
    # but adding "size: 24" here will ruin "file_body" instance,
    # although it looks legitimate at the first glance
types:
  file_index_entry:
    seq:
      - id: file_name
        type: str
        size: 16
      - id: file_pos
        type: u4
      - id: file_len
        type: u4
    instances:
      file_body:
        pos: file_pos
        size: file_len
To overcome that, one needs to use something like `io: _root._io` in `file_body`. Of course, the documentation warrants a somewhat better example and explanation.
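A minimal sketch of that workaround, assuming the same layout as the example above. Once `indexing` is given `size: 24`, it is parsed from its own 24-byte substream, so `file_body` has to name the root stream explicitly to seek within the whole input. The `meta` block (its `id`, `endian`, and `encoding` values) is an assumption added only to make the fragment self-contained.

```yaml
meta:
  id: file_index_example   # hypothetical name, not from the PR
  endian: le               # assumed byte order
  encoding: ASCII          # assumed string encoding
seq:
  - id: skipped
    size: 1000
  - id: indexing
    type: file_index_entry
    size: 24               # creates a 24-byte substream for this entry
types:
  file_index_entry:
    seq:
      - id: file_name
        type: str
        size: 16
      - id: file_pos
        type: u4
      - id: file_len
        type: u4
    instances:
      file_body:
        io: _root._io      # seek in the root stream, not the 24-byte substream
        pos: file_pos
        size: file_len
```

Without the `io` key, `pos: file_pos` would be interpreted relative to the 24-byte substream, and any `file_pos` beyond its end would fail.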
Excellent. I didn't know about `io:` either, so that's a good one to document! Nice feature!
@@ -380,6 +380,21 @@ enums:
17: udp
----

Alternatively, hexadecimal notation can also be used to define an enumeration:
Totally OK, but I'd also note that this is a feature provided by YAML, not something specific to KS.
I was thinking that a new section of the document could be created for general syntax and a very brief overview of YAML and what it provides. This example I provided may be better suited there.
Some Construct features are Python features, but I would advertise them just the same. Purpose of documentation is to show capabilities, not attribution. =) Just saying.
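For reference, a sketch of what hexadecimal enum keys look like. This is not necessarily the example from the PR: the `ip_protocol` name and the `icmp` and `tcp` entries are assumptions chosen to sit alongside the `17: udp` line visible in the diff context.

```yaml
enums:
  ip_protocol:
    0x01: icmp   # same key as decimal 1
    0x06: tcp    # same key as decimal 6
    0x11: udp    # same key as decimal 17
```

Because the YAML parser resolves `0x11` to the integer 17 before Kaitai Struct sees it, the two notations should be interchangeable.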
@@ -832,6 +832,39 @@ other value which was not listed explicitly.
_: rec_type_unknown
----

If an enumeration has already been defined, you can use references to
items in the enumeration instead of specifying integers a second time:
Actually, if you defined `key` as an enum, then you don't have much choice. You can't compare enums to integers without additional conversions.
Hmm good point, I'll update the text accordingly
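A sketch of the point being discussed, under assumed names: `protocol`, `ip_protocol`, `tcp_segment`, and `udp_datagram` are hypothetical and do not come from the PR. When the `switch-on` expression is an enum-typed attribute, the case keys are written as `enum_name::value` references rather than integers.

```yaml
meta:
  id: switch_enum_example   # hypothetical name
seq:
  - id: protocol
    type: u1
    enum: ip_protocol       # makes `protocol` an enum value, not an integer
  - id: body
    type:
      switch-on: protocol
      cases:
        'ip_protocol::tcp': tcp_segment
        'ip_protocol::udp': udp_datagram
enums:
  ip_protocol:
    6: tcp
    17: udp
types:
  tcp_segment:
    seq:
      - id: payload
        size-eos: true
  udp_datagram:
    seq:
      - id: payload
        size-eos: true
```

Matching on raw integers instead would need an explicit conversion in the `switch-on` expression, for example `protocol.to_i`.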
#...
data_field_depth:
seq:
#...
The pedantic person in me cries at that misaligned `#...` ;)
And, anyway, `seq` is totally optional, so maybe it's better to wrap it up as:
types:
  data_field_width: # ...
  data_field_height: # ...
  data_field_depth: # ...
for brevity.
Thanks, agreed
and cannot request data be provided out of sequential order. A stream
knows the maximum amount of data available to be requested by the
parser and the actual amount of data which has already been
requested by the parser.
This explanation is pretty abstract and somewhat misleading. A "stream" can be re-read as many times as needed, and it supports seeking: that's exactly how positional parse instances work; they use `seek` operations on a stream.
I'll think of another way to explain streams then, especially with reference to how `pos:` works (seeking) and how `io:` can be used to designate which stream to use.
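To make the seeking behaviour concrete, here is a minimal sketch of a positional parse instance. All names (`seek_example`, `ofs_footer`, `header`, `footer`) and the field sizes are hypothetical, and little-endian byte order is assumed.

```yaml
meta:
  id: seek_example     # hypothetical name
  endian: le           # assumed byte order
seq:
  - id: ofs_footer
    type: u4
  - id: header
    size: 12
instances:
  footer:
    pos: ofs_footer    # seek to this offset in the current stream
    size: 16
```

Reading `footer` seeks to `ofs_footer`, reads 16 bytes, and then returns the stream to where it was, so sequential parsing of `seq` is not disturbed. With no `io:` key, the seek applies to the stream of the type that owns the instance.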
stream. The root stream will know the maximum amount of data available
to be requested by the parser as the file size of the input file which
is being parsed. Initially, the root stream will know that 0 bits of
data have been requested by the parser.
Streams can be used on in-memory byte arrays too, not necessarily files (which have file sizes). And, actually, a stream does not "know" the full file size, but it can query it on demand. The file size can change if the file is modified while KS parsing is in progress, so it's actually OK for `_io.size` to return different values at different points in time.
That's a great point, probably one worth adding to the pitfalls section (or troubleshooting or similar) for the few people who may encounter the issue and not understand what is going on.
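As an illustration of where `_io.size` typically shows up, a sketch with hypothetical names (`trailer_example`, `trailer`) and an assumed 8-byte trailer at the end of the input:

```yaml
meta:
  id: trailer_example   # hypothetical name
instances:
  trailer:
    pos: _io.size - 8   # size is queried from the stream when this is evaluated
    size: 8
```

Since the size is asked for on demand rather than fixed up front, a file that grows or shrinks during parsing could place `trailer` at a different offset than expected, which is the situation the pitfalls section would need to warn about.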
This thread is stale, but could you resolve it before I work on any documentation?
Yeah, I just need to find some time to finish it...
I will try to help you finish it, with the little understanding of how Kaitai works that I have.
That would be most awesome %)
I will carefully review this PR and attempt to solve any existing issues.