Adds details about system macros and special forms #342

Merged (4 commits) on Oct 8, 2024

popematt
Contributor

Issue #, if available:

None

Description of changes:

Adds more details and examples for system macros and special forms.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

_books/ion-1-1/src/macros/special_forms.md
However, in Ion 1.1, the system symbol and macro tables have a system address space that is distinct from the local address space.
When starting an Ion 1.1 segment (i.e. immediately after encountering an `$ion_1_1` version marker), the local symbol
table is prepopulated with the system symbols.
(Why? They can also be used via opcode `EE` which requires the same number of bytes as the `E1` OpCode.)
Contributor

This is a good callout! There may not be a strong reason to pre-populate the symbol table. It lets you think of the initial state of the encoding module as "the system module," but that's not buying us very much.

Contributor Author

If we had some more system macros that could make it easier/more efficient to write an encoding directive, then I could see how useful it would be to have all of those macros pre-populated into the 00 - 3F range.

I think, though, that the main benefit is for implementers. When you have a partially complete implementation, it's a lot easier to start testing things if you have some symbols and macros that are automatically populated into their respective user-spaces.

Contributor

> If we had some more system macros that could make it easier/more efficient to write an encoding directive, then I could see how useful it would be to have all of those macros pre-populated into the 00 - 3F range.

Do you mean system symbols?

Contributor Author

No, I mean system macros. It wouldn't make a big difference in Ion text, but in Ion binary, having something like these could make a big difference if you have a lot of macros with relatively small bodies.

// Assume macro addresses starting at 20
(macro macro (flex_sym::name? parameters* body) ((.literal macro) name (parameters) body))
(macro u8_param (name) (.annotate (.. (.literal u8)) name))
(macro zero_or_one (p) (.values p (.literal ?)))

Let's compare some ways of declaring a macro. I'm using null as the macro body because the body doesn't matter in this particular example. (A simple {x:x, y:y, z:z} body would be 14 bytes, assuming that none of the field names are in the symbol table yet.) I'm using '<char> to represent the ASCII byte for a character (I don't want to convert all of it to hex).

| Macro definition | Size (bytes) | Raw bytes |
|---|---|---|
| `(macro point3d (uint8::x? uint8::y? uint8::z?) null)` | 36 | `F2 EE 0D A7 'p 'o 'i 'n 't '3 'd F2 E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '? F0 EA F0` |
| `(:macro point3d (:: uint8::x? uint8::y? uint8::z?) null)` | 33 | `20 06 F9 'p 'o 'i 'n 't '3 'd 2B E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '? EA` |
| `(:macro point3d (:: (:zero_or_one (:u8_param x)) (:zero_or_one (:u8_param y)) (:zero_or_one (:u8_param z))) null)` | 24 | `20 06 F9 'p 'o 'i 'n 't '3 'd 19 22 21 FF 'x 22 21 FF 'y 22 21 FF 'z EA` |
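
As a sanity check on the Size column, here is a minimal Python sketch that counts the bytes in each listing. It assumes every whitespace-separated token (including the '<char> placeholders) is exactly one byte; the row labels and the `encoded_size` helper are names made up for illustration and are not part of any Ion library.

```python
# Byte listings copied from the comparison above.
# Assumption: each whitespace-separated token is one byte, and a token
# such as 'p stands for the single ASCII byte of that character.
ENCODINGS = {
    "s-expression (macro ...)": """
        F2 EE 0D A7 'p 'o 'i 'n 't '3 'd
        F2 E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '?
        F0 EA F0
    """,
    "e-expression (:macro ...)": """
        20 06 F9 'p 'o 'i 'n 't '3 'd
        2B E7 01 95 A1 'x A1 '? E7 01 95 A1 'y A1 '? E7 01 95 A1 'z A1 '?
        EA
    """,
    "e-expression with helper macros": """
        20 06 F9 'p 'o 'i 'n 't '3 'd
        19 22 21 FF 'x 22 21 FF 'y 22 21 FF 'z
        EA
    """,
}


def encoded_size(byte_listing: str) -> int:
    """Count the bytes in a whitespace-separated listing (one token = one byte)."""
    return len(byte_listing.split())


sizes = {label: encoded_size(listing) for label, listing in ENCODINGS.items()}
for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

The counts come out to 36, 33, and 24 bytes, matching the sizes quoted for the three forms.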

Similarly, macros that produce TDL invocations, such as this one, could yield even more savings.

(macro if_some (a* b* c*) (.if_some a b c))

Encoding (.if_some a b c) would require 11 bytes, whereas (:if_some a b c) would only require 7-8 bytes (depending on whether if_some has a 1- or 2-byte address). Similar savings could be had for every other system macro and special form. However, this would be less user-friendly, because using E-Expressions in TDL to produce more TDL adds to the cognitive burden, and it might not be feasible: I haven't considered macro hygiene at all in this example.

Since it seems like we're going to be using . for TDL macro invocations, even just having a (:dot) macro could save a byte every time we call a macro from TDL. (The symbol . always requires 2 bytes, whereas (:dot) could have a 1-byte address.)
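
To put the per-invocation figures from the last two paragraphs in perspective, here is a rough Python sketch of how they add up. The byte counts are the ones quoted above; the invocation count of 100 and the `bytes_saved` helper are made-up illustrations, not measurements.

```python
def bytes_saved(invocations: int, tdl_bytes: int, eexp_bytes: int) -> int:
    """Total bytes saved when each TDL form is replaced by an e-expression."""
    return invocations * (tdl_bytes - eexp_bytes)


# Hypothetical number of invocations across a set of macro definitions.
CALLS = 100

# (.if_some a b c) at 11 bytes vs. (:if_some a b c) at 7 or 8 bytes.
if_some_best = bytes_saved(CALLS, tdl_bytes=11, eexp_bytes=7)   # 1-byte address
if_some_worst = bytes_saved(CALLS, tdl_bytes=11, eexp_bytes=8)  # 2-byte address

# The `.` symbol at 2 bytes vs. a (:dot) macro with a 1-byte address.
dot = bytes_saved(CALLS, tdl_bytes=2, eexp_bytes=1)

print(if_some_best, if_some_worst, dot)
```

Under these assumptions the savings are small per call but scale linearly with the number of TDL invocations in an encoding directive.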

Contributor

I see, thanks for explaining!

Contributor

zslayton left a comment

A few things to consider before merging.

Comment on lines +166 to +168
```ion
(macro make_struct (structs*) /* Not representable in TDL */)
```
Contributor

I'm ok with leaving it as make_struct (structs*) 👍. The important thing is that we have a way to make a dynamic field name in TDL, which make_field provides.

popematt merged commit a80193b into amazon-ion:gh-pages on Oct 8, 2024
1 check passed
3 participants