Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ion-js crashes when parsing very large string. #728

Open
everett1992 opened this issue Nov 10, 2022 · 2 comments
Open

ion-js crashes when parsing very large string. #728

everett1992 opened this issue Nov 10, 2022 · 2 comments

Comments

@everett1992
Copy link
Contributor

ion-js crashes when parsing a very large string

$ npm ls ion-js
[email protected] /home/ANT.AMAZON.COM/calebev/test
└── [email protected]
$ node -v
v18.12.0
ion.load(JSON.stringify(Buffer.alloc(50_000_000).fill('a').toString('base64')))

$ node -p "require('ion-js').load(JSON.stringify(Buffer.alloc(50_000_000).fill('a').toString('base64')))"

<--- Last few GCs --->

[641427:0x560ab00]    21984 ms: Scavenge 2028.7 (2064.9) -> 2028.8 (2069.9) MB, 5.6 / 0.0 ms  (average mu = 0.198, current mu = 0.143) allocation failure;
[641427:0x560ab00]    21994 ms: Scavenge 2031.6 (2069.9) -> 2031.8 (2070.9) MB, 7.8 / 0.0 ms  (average mu = 0.198, current mu = 0.143) allocation failure;
[641427:0x560ab00]    22003 ms: Scavenge 2032.6 (2070.9) -> 2032.4 (2081.6) MB, 8.7 / 0.0 ms  (average mu = 0.198, current mu = 0.143) allocation failure;


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xb6e500 node::Abort() [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 2: 0xa7e632  [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 3: 0xd47ea0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 4: 0xd48247 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 5: 0xf25605  [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 6: 0xf26508 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 7: 0xf36a13  [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 8: 0xf37888 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
 9: 0xf3aa55 v8::internal::Heap::HandleGCRequest() [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
10: 0xeb8baf v8::internal::StackGuard::HandleInterrupts() [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
11: 0x12b7645 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
12: 0x16e9979  [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]

JSON.parse parses the string fine.


I'm implementing a npm registry server and the npm publish api POST's a JSON object with the package tarball base64 encoded as a string. Parsing this body with regular JSON.parse causes multiple copies of data

const body = await buffer(request) // Buffer
const str = body.toString('utf8') // String
const  json = JSON.parse(str) // JSON
const tarball = Buffer.from(json._attachments["1.0.0"].data, 'base64') // tarball

I tested ion-js to see if the reader would save memory.

@zslayton
Copy link
Contributor

Just checking my understanding of your example:

require('ion-js')
    .load(JSON.stringify(Buffer.alloc(50_000_000).fill('a').toString('base64')))

We're creating a 50MB buffer filled with the ASCII character a, then converting that binary buffer to a base64-encoded string which will be about 67MB long. The resulting string will not be valid Ion syntax.

For example:

{{ImhlbGxvIHdvcmxkIg==}} // Text Ion blob
"ImhlbGxvIHdvcmxkIg=="   // Text Ion string

ImhlbGxvIHdvcmxkIg==      // Free-standing base64 data, not a valid Ion value 

There is indeed an OOM occurring, but I wanted to see whether this correctness issue affects your use case before diving too deeply on the memory management underlying especially large scalar values.

@everett1992
Copy link
Contributor Author

everett1992 commented Nov 24, 2022

The base64 string is passed to JSON.stringify so it will be a valid ion string.
Ignore Buffer / base64, it can be reproduced with

ion.load(JSON.stringify("a".repeat(50_000_000)))

I can also reproduce by feeding the output of dumpText back to load

const value = "a".repeat(3125000)
const ionText = ion.dumpText(value)
ion.load(ionText) // crashes here

Tho with larger strings dumpText will also crash

const value = "a".repeat(25_000_000)
const ionText = ion.dumpText(value) // crashes here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants