-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Applies various fixes to writing and reading of Ion 1.1 system symbols in binary. #994
Conversation
} | ||
if (valueTid == null || IonType.STRING != valueTid.type) { | ||
throwDueToInvalidType(IonType.STRING); | ||
String value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from IonReaderContinuableApplicationBinary. Behavior for Ion 1.0 remains the same because getSymbol(sid)
always returns null in that case. For Ion 1.1, symbols with known text (due to being system symbols or being read from encoding directives processed by the core reader) are now resolved. This enables a system-level (not encoding-level) transcode.
@@ -1031,34 +1009,6 @@ public SymbolTable getSymbolTable() { | |||
return cachedReadOnlySymbolTable; | |||
} | |||
|
|||
@Override | |||
public String stringValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to IonReaderContinuableCoreBinary.
@@ -1116,7 +1118,7 @@ String getSymbol(int sid) { | |||
*/ | |||
boolean matchesSystemSymbol_1_1(Marker marker, SystemSymbols_1_1 systemSymbol) { | |||
if (marker.typeId == IonTypeID.SYSTEM_SYMBOL_VALUE) { | |||
return marker.endIndex == systemSymbol.getId(); | |||
return systemSymbol.getText().equals(getSystemSymbolToken(marker).getText()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System symbol values are marked using the start and end index of the FixedUInt byte containing the symbol ID. getSystemSymbolToken
properly handles that case.
int minorVersion = getIonMinorVersion(); | ||
resetImports(getIonMajorVersion(), minorVersion); | ||
if (minorVersion > 0) { | ||
// TODO reset macro table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A future PR will work on adding encoding-level transcoding of Ion 1.1. This will be resolved in that PR.
@@ -2464,6 +2505,7 @@ protected final SymbolToken getSystemSymbolToken(Marker marker) { | |||
if (marker.startIndex == -1) { | |||
id = marker.endIndex; | |||
} else { | |||
prepareScalar(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this, failures occurred when the system symbol token's FixedUInt byte occurred beyond the data currently buffered.
@@ -292,7 +306,7 @@ public <T> T asFacet(Class<T> facetType) { | |||
|
|||
@Override | |||
public SymbolTable getSymbolTable() { | |||
// TODO generalize for Ion 1.1 | |||
// TODO generalize for Ion 1.1, whose system symbol table is not necessarily active. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't affect the functionality of the reader because this method is no longer used internally and is unlikely to be used often externally. Nevertheless, this should be fixed once we have a SymbolTable
-abstraction view of the Ion 1.1 system symbol table.
// TODO change the following once the Ion 1.1 symbol table is finalized. Probably: | ||
// sid10 -> sid10 - SystemSymbols.ION_1_0_MAX_ID; | ||
IntTransformer ION_1_0_SID_TO_ION_1_1_SID = IDENTITY_INT_TRANSFORMER; | ||
IntTransformer ION_1_0_SID_TO_ION_1_1_SID = sid -> sid - SystemSymbols.ION_1_0_MAX_ID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: ion-java-benchmark-cli uses this when transcoding data from Ion 1.0 to Ion 1.1.
@@ -761,7 +765,6 @@ internal class IonManagedWriter_1_1( | |||
|
|||
// Just in case—call finish to flush the current system values, then user values, and then write the IVM. | |||
finish() | |||
userData.writeIVM() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually harmful, as it would result in an IVM being written before the user data but after the system data. This caused any systemData encoding context to be dropped. It can be removed entirely because finish()
sets needsIvm
to true.
@@ -34,7 +34,6 @@ class Ion_1_1_RoundTripTest { | |||
override val newWriterForAppendable: (Appendable) -> IonWriter = builder::build | |||
} | |||
|
|||
@Disabled("Disabled because the text reader does not support Ion 1.1 encoding directives.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to delete this when text support for encoding directives was added. The tests pass.
0xE7, 0x01, 0x6A, // One FlexSym annotation, with opcode, opcode 6A = system symbol A = $ion_encoding | ||
0xC6, // ( | ||
0xC5, 0xEE, 0x0F, // S-exp, system symbol 0xF = symbol_table | ||
0xB2, 0x91, 'a', // ["a"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'a'
Nice! I had forgotten you can do this.
Description of changes:
0xEE
opcode and FlexSym system opcodes.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.