Python implementation of json.loads() accepts invalid unicode escapes #125660

Closed
nineteendo opened this issue Oct 17, 2024 · 6 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@nineteendo
Contributor

nineteendo commented Oct 17, 2024

Bug report

Bug description:

While reviewing #125652 and reading the documentation of int(), I realised this condition in json.decoder is insufficient:

if len(esc) == 4 and esc[1] not in 'xX':

>>> import sys
>>> sys.modules["_json"] = None
>>> import json
>>> json.loads(r'["\u 000", "\u-000", "\u+000", "\u0_00"]')
['\x00', '\x00', '\x00', '\x00']
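The root cause can be seen without the json module at all. The snippet below is only an illustration of the documented int() behaviour the report relies on: surrounding whitespace, a leading sign, a single underscore between digits, and (for base 16) a 0x prefix are all accepted.

```python
# Each string is four characters long, so it passes the len(esc) == 4 test,
# but none of them is a valid body for a JSON \uXXXX escape.
samples = [" 000", "-000", "+000", "0_00"]
values = [int(text, 16) for text in samples]
print(values)

# The existing esc[1] not in 'xX' test is there because int() also accepts
# a base prefix when base is 16.
prefixed = int("0x00", 16)
print(prefixed)
```

Every call returns 0, which is why all four escapes in the report decode to '\x00'.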

CPython versions tested on:

3.13

Operating systems tested on:

macOS

@nineteendo nineteendo added the type-bug An unexpected behavior, bug, or error label Oct 17, 2024
@nineteendo
Contributor Author

cc @serhiy-storchaka

@nineteendo
Contributor Author

Maybe something like this? Although it might be a better idea to use a stricter function.

esc = s[end:end + 4].strip()
if "_" not in esc and len(esc) == 4 and esc[0] not in "+-" and esc[1] not in "xX":
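To see what the condition above does with the inputs from the report, here is a rough sketch that wraps it in a function. The name passes_proposed_check is hypothetical and only exists for this illustration; it is not part of json.decoder.

```python
def passes_proposed_check(esc):
    # Hypothetical helper: the condition proposed above, applied to the
    # four characters that follow "\u".
    esc = esc.strip()
    return ("_" not in esc and len(esc) == 4
            and esc[0] not in "+-" and esc[1] not in "xX")

# The four escapes from the report, plus the 0x prefix case, are rejected.
rejected = [passes_proposed_check(e)
            for e in (" 000", "-000", "+000", "0_00", "0x00")]
# An ordinary escape body still passes.
accepted = passes_proposed_check("00e9")
print(rejected, accepted)
```

Note that this only filters out the extra syntax int() tolerates. Non-hex characters such as "00zz" still pass the check and rely on int() raising ValueError afterwards.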

@Eclips4 Eclips4 added the stdlib Python modules in the Lib dir label Oct 17, 2024
@serhiy-storchaka
Member

Either this, or simply use regexp.

@nineteendo
Contributor Author

Let's use a regex. Unicode digits aren't allowed either, but int() accepts them:

>>> int("\uff10", 16)
0
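A minimal sketch of the regex approach, assuming the escape body must be exactly four ASCII hex digits as the JSON grammar requires. The names HEX4 and parse_hex4 are made up for this example, and this is not necessarily the patch that was merged.

```python
import re

# An explicit ASCII character class is used on purpose: \d would also match
# Unicode digits such as the fullwidth zero "\uff10".
HEX4 = re.compile(r"[0-9A-Fa-f]{4}")

def parse_hex4(s, pos):
    """Return the integer for the four hex digits at s[pos:], or None."""
    match = HEX4.match(s, pos)
    if match is None:
        return None
    # int() only ever sees four ASCII hex digits here, so none of its
    # extra syntax (whitespace, sign, underscore, 0x) can slip through.
    return int(match.group(), 16)

bad = [" 000", "-000", "+000", "0_00", "0x00", "\uff10000"]
results = [parse_hex4(text, 0) for text in bad]
print(results)
print(parse_hex4("00e9", 0))
```

With this check, a caller would treat a None result as an invalid \uXXXX escape instead of passing the raw slice to int().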

@nineteendo
Contributor Author

Could there be other code that suffers from this problem?

@taleinat
Contributor

> Could there be other code that suffers from this problem?

Possibly, but that's out of scope for this issue. If someone finds such issues, they should be reported separately.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 18, 2024
…tion of JSON decoder (pythonGH-125683)

(cherry picked from commit df75136)

Co-authored-by: Nice Zombies <[email protected]>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Oct 19, 2024
…pickler and unpickler

pickle.Pickler and pickle.Unpickler instances have now managed dicts.
Arbitrary instance attributes, including persistent_id and persistent_load,
can now be set.
serhiy-storchaka pushed a commit that referenced this issue Oct 21, 2024
…ation of JSON decoder (GH-125683) (GH-125694)

(cherry picked from commit df75136)

Co-authored-by: Nice Zombies <[email protected]>
serhiy-storchaka pushed a commit that referenced this issue Oct 21, 2024
…ation of JSON decoder (GH-125683) (GH-125695)

(cherry picked from commit df75136)

Co-authored-by: Nice Zombies <[email protected]>