Non-standard escaping of zero bytes #2

dstelter · 2016-02-18T10:19:09Z

I'm wondering why zero bytes are now double-escaped, which essentially breaks naive JSON parsing (e.g. using cjson).

Could you shed some light on the reasoning? What's wrong with \u0000?

csanders-git · 2016-02-26T02:21:41Z

perhaps this is something that needs to be fixed.

dstelter · 2016-02-29T09:11:17Z

Perhaps @zimmerle knows more? He introduced the change from \u0000 to \u0000 after all. :-)

zimmerle · 2016-02-29T12:53:52Z

Hi @dstelter,

Giving the example: '\u0000', the '' is not a escaping character, rather it is part of the unicode representation of '\0'.

In ModSecurity we use the YAJL library to load the JSON into memory. The YAJL library considers the content of a JSON to be a NULL-terminated string. This means that whenever it face a '\0' it considers (somewhat expected) the end of the string, thus, not reading the entire data. The escape character was placed to avoid the partial loading of a content. Once the content is load into memory the double slash (\u0000) is replaced by a single slash, therefore, representing again a NULL byte.

I opened a ticket on GitHub for yajl project (I don't remember the issue number) to ask for an api where I will be able to determinate the actually number of bytes to be read, disregard of the NULL byte. The developers, gave me an convince explanation that JSON should not transport such stuff as NULL byte. Maybe to use an transport encoding on top of it (base64?), but i didn't like the idea of loosing the readability, so I ended up adopting this creative way of represent the null byte: \u0000.

Is that a problem to you? Do you have a better idea? The library that you are using to load those JSON files is able to handle the \u0000 right?

dstelter · 2016-02-29T13:24:13Z

Thanks for the clarification! I see your point now.
I'm using cjson (lua-cjson, to be precise), which handles null bytes correctly.

It is certainly not a big deal to "unfix" the \u0000 to \u0000, yet still somewhat confusing for test consumers. After all \u0000 is the canonical representation of null bytes.

Imo, such implementation quirks should be kept out of the test data and rather be dealt with by some loading shim.
I'd be happy to prepare a PR for ModSecurity if you point me to the test runner code. :-)

zimmerle · 2016-02-29T13:33:21Z

Hi @dstelter,

I agree with you, i would prefer to have the \u0000 as \u0000 not \u0000. Your help will be more than appreciate.

The code for the test utility is available here:
https://github.com/SpiderLabs/ModSecurity/tree/libmodsecurity/test/unit

The specific part where the \u0000 is replaced, is here:
https://github.com/SpiderLabs/ModSecurity/blob/libmodsecurity/test/unit/unit_test.cc

We don't want to add another dependency (e.g. another JSON library), and I would prefer not to replace the YAJL, as it is used in different parts of the code, unless it is really necessary.

Notice that the tests repository is a gitsub repo, inside the libmodsecurity branch.

Thanks!!

victorhora added the enhancement label Apr 9, 2019

victorhora assigned zimmerle Apr 9, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-standard escaping of zero bytes #2

Non-standard escaping of zero bytes #2

dstelter commented Feb 18, 2016

csanders-git commented Feb 26, 2016

dstelter commented Feb 29, 2016

zimmerle commented Feb 29, 2016

dstelter commented Feb 29, 2016

zimmerle commented Feb 29, 2016

Non-standard escaping of zero bytes #2

Non-standard escaping of zero bytes #2

Comments

dstelter commented Feb 18, 2016

csanders-git commented Feb 26, 2016

dstelter commented Feb 29, 2016

zimmerle commented Feb 29, 2016

dstelter commented Feb 29, 2016

zimmerle commented Feb 29, 2016