Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-standard escaping of zero bytes #2

Open
dstelter opened this issue Feb 18, 2016 · 5 comments
Open

Non-standard escaping of zero bytes #2

dstelter opened this issue Feb 18, 2016 · 5 comments
Assignees

Comments

@dstelter
Copy link

I'm wondering why zero bytes are now double-escaped, which essentially breaks naive JSON parsing (e.g. using cjson).

Could you shed some light on the reasoning? What's wrong with \u0000?

@csanders-git
Copy link

perhaps this is something that needs to be fixed.

@dstelter
Copy link
Author

Perhaps @zimmerle knows more? He introduced the change from \u0000 to \u0000 after all. :-)

@zimmerle
Copy link
Contributor

Hi @dstelter,

Giving the example: '\u0000', the '' is not a escaping character, rather it is part of the unicode representation of '\0'.

In ModSecurity we use the YAJL library to load the JSON into memory. The YAJL library considers the content of a JSON to be a NULL-terminated string. This means that whenever it face a '\0' it considers (somewhat expected) the end of the string, thus, not reading the entire data. The escape character was placed to avoid the partial loading of a content. Once the content is load into memory the double slash (\u0000) is replaced by a single slash, therefore, representing again a NULL byte.

I opened a ticket on GitHub for yajl project (I don't remember the issue number) to ask for an api where I will be able to determinate the actually number of bytes to be read, disregard of the NULL byte. The developers, gave me an convince explanation that JSON should not transport such stuff as NULL byte. Maybe to use an transport encoding on top of it (base64?), but i didn't like the idea of loosing the readability, so I ended up adopting this creative way of represent the null byte: \u0000.

Is that a problem to you? Do you have a better idea? The library that you are using to load those JSON files is able to handle the \u0000 right?

@dstelter
Copy link
Author

Thanks for the clarification! I see your point now.
I'm using cjson (lua-cjson, to be precise), which handles null bytes correctly.

It is certainly not a big deal to "unfix" the \u0000 to \u0000, yet still somewhat confusing for test consumers. After all \u0000 is the canonical representation of null bytes.

Imo, such implementation quirks should be kept out of the test data and rather be dealt with by some loading shim.
I'd be happy to prepare a PR for ModSecurity if you point me to the test runner code. :-)

@zimmerle
Copy link
Contributor

Hi @dstelter,

I agree with you, i would prefer to have the \u0000 as \u0000 not \u0000. Your help will be more than appreciate.

The code for the test utility is available here:
https://github.com/SpiderLabs/ModSecurity/tree/libmodsecurity/test/unit

The specific part where the \u0000 is replaced, is here:
https://github.com/SpiderLabs/ModSecurity/blob/libmodsecurity/test/unit/unit_test.cc

We don't want to add another dependency (e.g. another JSON library), and I would prefer not to replace the YAJL, as it is used in different parts of the code, unless it is really necessary.

Notice that the tests repository is a gitsub repo, inside the libmodsecurity branch.

Thanks!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants