Adds details about system macros and special forms #342
However, in Ion 1.1, the system symbol and macro tables have a system address space that is distinct from the local address space.
When starting an Ion 1.1 segment (i.e. immediately after encountering an `$ion_1_1` version marker), the local symbol
table is prepopulated with the system symbols.
(Why? They can also be used via opcode `EE` which requires the same number of bytes as the `E1` OpCode.)
This is a good callout! There may not be a strong reason to pre-populate the symbol table. It lets you think of the initial state of the encoding module as "the system module," but that's not buying us very much.
If we had some more system macros that could make it easier/more efficient to write an encoding directive, then I could see how useful it would be to have all of those macros pre-populated into the `00` - `3F` range.
I think, though, that the main benefit is for implementers. When you have a partially complete implementation, it's a lot easier to start testing things if you have some symbols and macros that are automatically populated into their respective user-spaces.
> If we had some more system macros that could make it easier/more efficient to write an encoding directive, then I could see how useful it would be to have all of those macros pre-populated into the `00` - `3F` range.

Do you mean system symbols?
No, I mean system macros. It wouldn't make a big difference in Ion text, but in Ion binary, having something like these could make a big difference if you have a lot of macros with relatively small bodies.
// Assume macro addresses starting at 20
(macro macro (flex_sym::name? parameters* body) ((.literal macro) name (parameters) body))
(macro u8_param (name) (.annotate (.. (.literal u8)) name))
(macro zero_or_one (p) (.values p (.literal ?)))
Let's compare some ways of declaring a macro. I'm using `null` as the macro body because it doesn't make a big difference in this particular example. (A simple `{x:x, y:y, z:z}` body would be 14 bytes, assuming that none of the field names are in the symbol table yet.) I'm using `'<char>` to represent the ASCII byte for a character (I don't want to convert all of it to hex).

| Macro definition | Size (bytes) | Raw Bytes |
|---|---|---|
| `(macro point3d (uint8::x? uint8::y? uint8::z?) null)` | 36 | `F2 EE 0D A7 'p 'o 'i 'n 't '3 'd F2 E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '? F0 EA F0` |
| `(:macro point3d (:: uint8::x? uint8::y? uint8::z?) null)` | 33 | `20 06 F9 'p 'o 'i 'n 't '3 'd 2B E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '? EA` |
| `(:macro point3d (:: (:zero_or_one (:u8_param x)) (:zero_or_one (:u8_param y)) (:zero_or_one (:u8_param z))) null)` | 24 | `20 06 F9 'p 'o 'i 'n 't '3 'd 19 22 21 FF 'x 22 21 FF 'y 22 21 FF 'z EA` |

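As a quick sanity check on the sizes quoted for the three encodings, here is a minimal Python sketch that counts the byte tokens in each raw encoding. The dictionary keys (`tdl_sexp`, `macro_eexp`, `helper_eexp`) and the `encoded_size` helper are hypothetical names chosen for illustration; the byte strings are copied from the comparison as written, and the sketch only counts tokens, it does not parse or validate Ion binary.

```python
# Raw encodings copied from the comparison above.
# A token such as 'p stands for the single ASCII byte of that character.
ENCODINGS = {
    # Plain s-expression form: (macro point3d (uint8::x? ...) null)
    "tdl_sexp": (
        "F2 EE 0D A7 'p 'o 'i 'n 't '3 'd F2 "
        "E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '? "
        "F0 EA F0"
    ),
    # E-expression using the hypothetical (:macro ...) helper
    "macro_eexp": (
        "20 06 F9 'p 'o 'i 'n 't '3 'd 2B "
        "E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '? "
        "EA"
    ),
    # E-expression that also uses (:zero_or_one (:u8_param ...)) helpers
    "helper_eexp": (
        "20 06 F9 'p 'o 'i 'n 't '3 'd 19 "
        "22 21 FF 'x 22 21 FF 'y 22 21 FF 'z "
        "EA"
    ),
}


def encoded_size(raw: str) -> int:
    """Return the byte count, treating each whitespace-separated token as one byte."""
    return len(raw.split())


SIZES = {name: encoded_size(raw) for name, raw in ENCODINGS.items()}

if __name__ == "__main__":
    for name, size in SIZES.items():
        print(f"{name}: {size} bytes")
```

The token counts agree with the sizes given in the comparison, so the savings come entirely from replacing the `E7 01 95 A1 'x A1 '?` parameter encoding with the shorter helper-macro invocations.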
Similarly, macros that produce TDL invocations, such as this, could add even more savings.
(macro if_some (a* b* c*) (.if_some a b c))
Encoding `(.if_some a b c)` would require 11 bytes, whereas `(:if_some a b c)` would only require 7-8 bytes (depending on whether `if_some` has a 1 or 2 byte address). Similar savings are possible for every other system macro and special form. However, this would make it less user-friendly because using E-Expressions in TDL to produce more TDL adds to the cognitive burden, and it might not be feasible; I haven't considered macro hygiene at all in this example.
Since it seems like we're going to be using `.` for TDL macro invocations, even just having a `(:dot)` macro could save a byte every time we call a macro from TDL. (The symbol `.` always requires 2 bytes, whereas `(:dot)` could have a 1-byte address.)
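To put the per-invocation numbers from this comment side by side, here is a rough Python sketch. The constants are the byte counts quoted above (11 vs. 7-8 bytes for `if_some`, 2 vs. 1 byte for `.`); the `bytes_saved` helper and the invocation count used in the example are hypothetical and only illustrate the arithmetic.

```python
# Byte counts quoted in the comment above (not measured from a real encoder).
IF_SOME_TDL = 11          # (.if_some a b c) written as a plain TDL s-expression
IF_SOME_EEXP_1BYTE = 7    # (:if_some a b c) if the macro has a 1-byte address
IF_SOME_EEXP_2BYTE = 8    # (:if_some a b c) if the macro has a 2-byte address
DOT_SYMBOL = 2            # the symbol `.` in binary
DOT_EEXP = 1              # a hypothetical (:dot) macro with a 1-byte address


def bytes_saved(plain_size: int, eexp_size: int, invocations: int = 1) -> int:
    """Total bytes saved by using the E-expression form `invocations` times."""
    return (plain_size - eexp_size) * invocations


if __name__ == "__main__":
    print("if_some, 1-byte address:", bytes_saved(IF_SOME_TDL, IF_SOME_EEXP_1BYTE))
    print("if_some, 2-byte address:", bytes_saved(IF_SOME_TDL, IF_SOME_EEXP_2BYTE))
    # Assumed example: a template body containing 10 TDL macro invocations.
    print("dot, 10 invocations:", bytes_saved(DOT_SYMBOL, DOT_EEXP, invocations=10))
```

The savings scale linearly with the number of TDL invocations, which is why they matter most for encoding directives that define many macros with small bodies.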
I see, thanks for explaining!
A few things to consider before merging.
```ion
(macro make_struct (structs*) /* Not representable in TDL */)
```
I'm ok with leaving it as `make_struct (structs*)` 👍. The important thing is that we have a way to make a dynamic field name in TDL, which `make_field` provides.
Issue #, if available:
None
Description of changes:
Adds more details and examples for system macros and special forms.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.