Commit
Define Ion 1.1 system symbols with separate SID space (#338)
popematt committed Aug 30, 2024
1 parent 77fd91c commit f858c7a
Showing 6 changed files with 511 additions and 12 deletions.
2 changes: 2 additions & 0 deletions _books/ion-1-1/src/SUMMARY.md
@@ -3,6 +3,8 @@
- [Introduction](./introduction.md)
- [What's new](./whats_new.md)
- [Macros by example](macros_by_example.md)
- [Modules](modules.md)
- [System Module](modules/system_module.md)
- [Binary encoding](binary/encoding.md)
- [Encoding primitives](binary/primitives.md)
- [`FlexUInt`](binary/primitives/flex_uint.md)
21 changes: 20 additions & 1 deletion _books/ion-1-1/src/binary/e_expressions.md
@@ -98,10 +98,29 @@ Address : 142918

<div class="warning">

> This section was obsolete and needs to be rewritten.
> This section needs more details.
</div>

The opcode is `0xEE`. The macro address is given as a trailing [FlexUInt](primitives/flex_uint.md) with no bias.


## System Macro Invocations

E-expressions that invoke a [system macro](../modules/system_module.md#system-macro-addresses) can be encoded using the `0xEF` opcode followed by a _non-negative_ 1-byte `FixedInt`.
(Negative values are used for [system symbols](values/symbol.md#system-symbols).)

##### Encoding of the system macro `values`
```
┌──── Opcode 0xEF indicates a system symbol or macro invocation
│  ┌─── FixedInt 0 indicates macro 0 from the system macro table
│  │
EF 00
```
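
The split between macro addresses and system symbols can be sketched as a small decoder for the byte that follows `0xEF`. This is a minimal illustration, not a reference implementation: the function name `classify_system_operand` and its return shape are hypothetical, and it treats zero as a macro address because the `EF 00` example above does.

```python
def classify_system_operand(operand: int) -> tuple[str, int]:
    """Interpret the single byte that follows opcode 0xEF.

    Hypothetical helper: returns ("system_macro", address) or
    ("system_symbol", sid).
    """
    if not 0 <= operand <= 0xFF:
        raise ValueError("operand must be a single byte")
    # Reinterpret the byte as a signed 8-bit two's complement FixedInt.
    value = operand - 0x100 if operand >= 0x80 else operand
    if value < 0:
        # Negative values select a system symbol by absolute value.
        return ("system_symbol", -value)
    # Zero and positive values select an entry in the system macro table.
    return ("system_macro", value)
```

For example, the operand `0x00` classifies as system macro 0, and the operand `0xFF` (FixedInt -1) classifies as system symbol 1.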

In addition, system macros MAY be invoked using any of the `0x00`-`0x5F` or `0xEE` opcodes, provided that the macro being invoked has been given an address in the user macro address space.
<!-- TODO: Add or link an example of how this can be done. /-->

## Tagged E-expression Argument Encoding

When a macro parameter has a tagged type, the encoding of that parameter's corresponding argument in an E-expression
42 changes: 31 additions & 11 deletions _books/ion-1-1/src/binary/primitives/flex_sym.md
@@ -9,14 +9,7 @@ A `FlexSym` begins with a [`FlexInt`](#flexint); once this integer has been read
No more bytes follow.
* **less than zero**, its absolute value represents a number of UTF-8 bytes that follow the `FlexInt`. These bytes
represent the symbol’s text.
* **exactly zero**, another byte follows that is an [opcode](opcodes.md). The `FlexSym` parser is not responsible for
evaluating this opcode, only returning it—the caller will decide whether the opcode is legal in the current context.
Example usages of the opcode include:
* Representing SID `$0` as `0xA0`.
* Representing the empty string (`""`) as `0x90`.
* When used to encode a struct field name, the opcode can invoke a macro that will evaluate to a struct whose key/value
pairs are spliced into the parent [struct](../values/struct.md).
* In a <<delimited_structs, delimited struct>>, terminating the sequence of `(field name, value)` pairs with `0xF0`.
* **exactly zero**, another byte follows that is a [`FlexSymOpCode`](#flexsymopcode).

#### `FlexSym` encoding of symbol ID `$10`
```
@@ -40,13 +33,40 @@ pairs are spliced into the parent [struct](../values/struct.md).
negative 5
```
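
The three `FlexSym` cases (positive, negative, zero) can be sketched as follows. This is a simplified illustration that only handles a leading `FlexInt` that fits in one byte, assuming the single-byte layout is a low end-marker bit of `1` with a 7-bit two's complement value above it. The name `decode_flexsym` and the returned tuples are hypothetical.

```python
def decode_flexsym(data: bytes) -> tuple[str, object]:
    """Decode a FlexSym whose leading FlexInt occupies a single byte.

    Hypothetical helper: returns ("sid", n), ("text", s), or ("opcode", b).
    """
    if not data:
        raise ValueError("no input")
    first = data[0]
    if first & 1 == 0:
        # A low bit of 0 means more FlexInt bytes follow.
        raise NotImplementedError("multi-byte FlexInt is outside this sketch")
    # Drop the end-marker bit, then sign-extend the remaining 7 bits.
    value = first >> 1
    if value >= 0x40:
        value -= 0x80
    if value > 0:
        # Greater than zero: the value is a symbol ID.
        return ("sid", value)
    if value < 0:
        # Less than zero: the absolute value is a UTF-8 byte count.
        length = -value
        text_bytes = data[1:1 + length]
        if len(text_bytes) != length:
            raise ValueError("truncated symbol text")
        return ("text", text_bytes.decode("utf-8"))
    # Exactly zero: one more byte follows; return it unevaluated,
    # leaving the caller to decide whether it is legal in context.
    if len(data) < 2:
        raise ValueError("missing byte after zero FlexInt")
    return ("opcode", data[1])
```

The opcode case deliberately returns the byte without interpreting it, mirroring the rule that the `FlexSym` parser does not evaluate it.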

### `FlexSymOpCode`

`FlexSymOpCode`s are a combination of [system symbols](../../modules/system_module.md#system-symbols) and a subset of the general [opcodes](../opcodes.md).
The `FlexSym` parser is not responsible for evaluating a `FlexSymOpCode`, only returning it—the caller will decide whether the opcode is legal in the current context.

Example usages of the `FlexSymOpCode` include:
* Representing SID `$0`
* Representing system symbols
* Note that the empty symbol (i.e. the symbol `''`) is now a system symbol and can be referenced this way.
* When used to encode a struct field name, the opcode can invoke a macro that will evaluate
to a struct whose key/value pairs are spliced into the parent [struct](../values/struct.md).
* In a [delimited struct](../values/struct.md#delimited-encoding), terminating the sequence of `(field name, value)` pairs with `0xF0`.


| OpCode Byte | Meaning | Additional Notes |
|:---------------:|:----------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `0x00` - `0x5F` | E-Expression | May be used when the `FlexSym` occurs in the field name position of any struct |
| `0x60` | Symbol with unknown text (also known as `$0`) | |
| `0x61` - `0xDF` | System SID (with `0x60` bias) | While the range of `0x61` - `0xDF` is reserved for system symbols, not all of these bytes correspond to a system symbol. See [system symbols](../../modules/system_module.md#system-symbols) for the list of system symbols. |
| `0xEE` | _TODO: Add meaning_ | |
| `0xEF` | E-Expression invoking a system macro | May be used when the `FlexSym` occurs in the field name position of any struct |
| `0xF0`          | Delimited container end marker                | May only be used when the `FlexSym` occurs in the field name position of a delimited struct                                                                                                                                  |
| `0xF5` | Length-prefixed macro invocation | May be used when the `FlexSym` occurs in the field name position of any struct |
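
The table can be read as a dispatch on the opcode byte. The sketch below is illustrative only: the function name `classify_flexsym_opcode` and the category labels are hypothetical, and `0xEE` is reported as unspecified because the table does not yet give it a meaning.

```python
def classify_flexsym_opcode(byte: int) -> tuple[str, object]:
    """Map a FlexSymOpCode byte to a (category, detail) pair.

    Hypothetical helper; categories follow the table above.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("opcode must be a single byte")
    if byte <= 0x5F:
        return ("e_expression", byte)
    if byte == 0x60:
        return ("unknown_text", 0)  # symbol with unknown text, $0
    if byte <= 0xDF:
        # System SID with a 0x60 bias. Not every value in this range
        # corresponds to a defined system symbol.
        return ("system_symbol", byte - 0x60)
    if byte == 0xEE:
        return ("unspecified", None)  # meaning is still a TODO in the table
    if byte == 0xEF:
        return ("system_macro_e_expression", None)
    if byte == 0xF0:
        return ("delimited_container_end", None)
    if byte == 0xF5:
        return ("length_prefixed_macro_invocation", None)
    raise ValueError(f"0x{byte:02X} is not listed as a FlexSymOpCode")
```

With this mapping, the byte `0x77` yields system SID 23 (`0x77 - 0x60`). A caller would still need to check context, for example that `0xF0` only appears in the field name position of a delimited struct.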



#### `FlexSym` encoding of `''` (empty text) using an opcode
```
              ┌─── The leading FlexInt ends in a `1`,
              │    no more FlexInt bytes follow.
0 0 0 0 0 0 0 1   10010000
0 0 0 0 0 0 0 1   01110111
└─────┬─────┘     └───┬──┘
  2's comp.       opcode 0x90:
    zero          empty symbol
  2's comp.       FixedInt 0x77,
    zero          System SID 23
                  (the empty symbol)
```
19 changes: 19 additions & 0 deletions _books/ion-1-1/src/binary/values/symbol.md
@@ -61,3 +61,22 @@ address that is decoded is biased by the number of addresses that can be encoded
| `0xE1` | 0 to 255 | 0 |
| `0xE2` | 256 to 65,791 | 256 |
| `0xE3` | 65,792 to infinity | 65,792 |
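
The bias arithmetic in this table can be sketched as below. The helper names are hypothetical, and the sketch works only with the integer that has already been read from (or is about to be written to) the bytes after the opcode; how that integer is laid out in bytes is outside its scope.

```python
# Bias applied to the raw value that follows each symbol address opcode,
# taken from the table above.
SYMBOL_ADDRESS_BIAS = {0xE1: 0, 0xE2: 256, 0xE3: 65_792}


def biased_symbol_address(opcode: int, raw_value: int) -> int:
    """Return the symbol address for a raw value read after `opcode`."""
    if raw_value < 0:
        raise ValueError("raw value must be non-negative")
    return raw_value + SYMBOL_ADDRESS_BIAS[opcode]


def opcode_for_symbol_address(address: int) -> tuple[int, int]:
    """Pick the opcode whose range covers `address`; return (opcode, raw)."""
    if address < 0:
        raise ValueError("address must be non-negative")
    if address <= 255:
        return (0xE1, address)
    if address <= 65_791:
        return (0xE2, address - 256)
    return (0xE3, address - 65_792)
```

For instance, address 256 maps to opcode `0xE2` with a raw value of 0, which is the point of the bias: the smallest raw value of each opcode continues where the previous range ended.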


### System Symbols

<!-- TODO: Add link to "system-module" page somewhere in this section. /-->

System symbols (that is, symbols defined in the system module) can be encoded using the `0xEF` opcode followed by a _negative_ 1-byte `FixedInt`.
(Non-negative values are used for [system macro invocations](../e_expressions.md#system-macro-invocations).)

Unlike in Ion 1.0, symbols are not required to use the lowest available SID for a given text, and system symbols
_MAY_ be encoded using other SIDs.

##### Encoding of the system symbol `$ion`
```plain
┌──── Opcode 0xEF indicates a system symbol or macro invocation
│  ┌─── FixedInt -1 indicates system symbol 1
│  │
EF FF
```
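
Going the other way, the two bytes in the example can be produced from a system SID as sketched below. The function name `encode_system_symbol` is hypothetical, and the upper bound of 128 is an assumption that follows from the negative range of a 1-byte two's complement `FixedInt`; the system symbol table itself may define fewer symbols.

```python
def encode_system_symbol(sid: int) -> bytes:
    """Encode a system symbol as opcode 0xEF plus a negative 1-byte FixedInt.

    Hypothetical helper; does not check that `sid` is a defined system symbol.
    """
    if not 1 <= sid <= 128:
        raise ValueError("sid must fit in a negative 1-byte FixedInt")
    # Two's complement of -sid in 8 bits.
    operand = (-sid) & 0xFF
    return bytes([0xEF, operand])
```

Calling `encode_system_symbol(1)` gives the bytes `EF FF` shown in the diagram.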
5 changes: 5 additions & 0 deletions _books/ion-1-1/src/modules.md
@@ -0,0 +1,5 @@
# Ion 1.1 Modules

Modules are a generalization of the symbol tables found in Ion 1.0.

<!-- TODO: More details /-->