Surrogate pair #11

Open
rudolph-miller opened this issue Nov 17, 2015 · 2 comments

Comments

@rudolph-miller

(yason:parse "\"\\uD840\\uDC0B\"")
;; => "𠀋"

(with-input-from-string (stream "\"\\uD840\\uDC0B\"")
  (cl-json:decode-json stream))
;; => "��"
@leosongwei

leosongwei commented Nov 3, 2016

This "bug" is in decoder.lisp (around line 160):

                 ((len rdx)
                  (let ((code
                         (let ((repr (make-string len)))
                           (dotimes (i len)
                             (setf (aref repr i) (read-char stream)))
                           (handler-case (parse-integer repr :radix rdx)
                             (parse-error ()
                               (json-syntax-error stream esc-error-fmt
                                                  (format nil "\\~C" c)
                                                  repr))))))
                    (restart-case
                        (or (and (< code char-code-limit) (code-char code))
                            (error 'no-char-for-code :code code))

Each "\u" escape sequence is decoded on its own into a separate character and returned as is. Surrogate pairs are simply not implemented in CL-JSON.

I've got a dirty hack:

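(let ((str (with-input-from-string (stream "\"\\uD83D\\uDE03\"")
             (cl-json:decode-json stream))))
  ;; STR holds two characters: the high surrogate, then the low surrogate.
  ;; Merge their low 10 bits each into a single code point above #xFFFF.
  (princ (code-char
          (let ((c1 (char-code (aref str 0)))
                (c2 (char-code (aref str 1))))
            (+ #x10000
               (ash (logand #x03FF c1) 10)
               (logand #x03FF c2))))))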

=>
😃
#\SMILING_FACE_WITH_OPEN_MOUTH
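
The same arithmetic can be applied to a whole decoded string rather than to two fixed indices. What follows is only a sketch of a post-processing workaround, not a fix inside CL-JSON itself. The names `high-surrogate-p`, `low-surrogate-p` and `combine-surrogates` are made up for illustration, and the code assumes an implementation (such as SBCL) where `code-char` returns a character for surrogate code points and `char-code-limit` is above #x10FFFF.

```lisp
;; Hypothetical helpers, not part of CL-JSON.
(defun high-surrogate-p (code)
  "True if CODE is in the UTF-16 high (leading) surrogate range."
  (<= #xD800 code #xDBFF))

(defun low-surrogate-p (code)
  "True if CODE is in the UTF-16 low (trailing) surrogate range."
  (<= #xDC00 code #xDFFF))

(defun combine-surrogates (string)
  "Return a copy of STRING in which each high/low surrogate pair is
replaced by the single character it encodes. Unpaired surrogates and
all other characters are copied unchanged."
  (with-output-to-string (out)
    (let ((i 0)
          (n (length string)))
      (loop while (< i n)
            do (let ((c1 (char-code (char string i))))
                 (if (and (high-surrogate-p c1)
                          (< (1+ i) n)
                          (low-surrogate-p (char-code (char string (1+ i)))))
                     ;; Valid pair: merge the low 10 bits of each half.
                     (let ((c2 (char-code (char string (1+ i)))))
                       (write-char (code-char (+ #x10000
                                                 (ash (logand #x03FF c1) 10)
                                                 (logand #x03FF c2)))
                                   out)
                       (incf i 2))
                     ;; Anything else is passed through as is.
                     (progn
                       (write-char (char string i) out)
                       (incf i))))))))
```

It would be called on the decoder's result, e.g. `(combine-surrogates (cl-json:decode-json-from-string "\"\\uD83D\\uDE03\""))`, provided the decoder returns the two surrogate characters as it does in the hack above.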

@eadmund

eadmund commented Sep 7, 2018

This also causes a rather nasty failure to handle output:

(json:decode-json-from-string "\"\\uD83D\\uDE02\\uD83D\\uDE02\"")

I suggest using YASON instead.
