Unicode/UTF-8 is unnecessarily escaped #37

Open
jujucheng opened this issue Aug 29, 2016 · 15 comments

Comments

@jujucheng

The result is "\u963f".
[Image: ohmygod]
The picture shows my problem. Can anyone help me?

@timhall
Member

timhall commented Aug 29, 2016

Hi @jujucheng, can you give a little more explanation of the issue? Does this describe the problem?

Input: JsonConverter.ConvertToJson(Array("阿"))
Expected: ["阿"]
Actual: ["\u963f"]

@timhall
Member

timhall commented Aug 29, 2016

OK, I have reproduced the issue. It looks like those characters may be escaped according to the spec, but they don't need to be. I'll look into changing the behavior or adding an option for conditionally escaping Unicode/UTF-8.
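
To illustrate why both forms are acceptable: "\u963f" and "阿" denote the same code point, so a conforming JSON parser should decode them to the same string. The sketch below converts between the two forms. `ToUnicodeEscape` and `FromUnicodeEscape` are hypothetical helper names used only for illustration; they are not part of VBA-JSON.

```vba
' Hypothetical helpers for illustration only (not part of VBA-JSON).

' Returns the \uXXXX escape for the first character of ch.
Public Function ToUnicodeEscape(ByVal ch As String) As String
    Dim code As Long
    code = AscW(ch)
    ' AscW returns a signed 16-bit value; map negatives into 0..65535.
    If code < 0 Then code = code + 65536
    ToUnicodeEscape = "\u" & Right$("0000" & Hex$(code), 4)
End Function

' Decodes a single \uXXXX escape back to its character.
Public Function FromUnicodeEscape(ByVal esc As String) As String
    ' Skip the leading "\u" and read the four hex digits.
    FromUnicodeEscape = ChrW(Val("&H" & Mid$(esc, 3, 4)))
End Function
```

Hex digits in an escape are case-insensitive, so "\u963f" and "\u963F" decode to the same character.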

@timhall changed the title from MsgBox JsonConverter.ConvertToJson("阿") to Unicode/UTF-8 is unnecessarily escaped on Aug 29, 2016

@jujucheng
Author

Yes, that's the problem.
Really looking forward to a solution!

@vikct

vikct commented Nov 4, 2016

Hi Tim, just wondering whether this issue has been fixed? I have encountered the same issue as well.

Something like this: "SMS": "\u77ED\u4FE1"

@nextcrom

nextcrom commented Nov 14, 2016

Hi,

This is just how I resolved my own case, not a general solution.
I hope it gives you some hints.

Reference site
'https://social.technet.microsoft.com/Forums/en-US/c1def5ab-7c60-4927-b828-f015c4853795/excel-file-to-utf8-encoded-text-file?forum=officesetupdeploylegacy

Set objStream = CreateObject("ADODB.Stream")

'----- Your existing code -----
'Remove or comment out the Open/Print/Close lines below.
'Keep the myfile assignment, because SaveToFile needs it.
myfile = Application.ActiveWorkbook.Path & "\data.json"
'Open myfile For Output As #1
'Print #1, ConvertToJson(items, Whitespace:=2)
'Close #1

    With objStream
        .Type = 2              ' adTypeText
        .Charset = "utf-8"
        .Open
        .WriteText ConvertToJson(items, Whitespace:=2)
        .SaveToFile myfile, 2  ' adSaveCreateOverWrite
        .Close
    End With

In the JsonConverter module, inside
Private Function json_Encode(ByVal json_Text As Variant) As String

'---- Existing code ----

remove or comment out the code below:
Case 0 To 31, 127 To 65535
' Non-ascii characters -> convert to 4-digit hex
json_Char = "\u" & VBA.Right$("0000" & VBA.Hex$(json_AscCode), 4)
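
One caveat with removing the whole case: JSON requires control characters (U+0000 to U+001F) to be escaped, so a narrower variation is to keep escaping those and leave everything above 126 as-is. The following is a minimal standalone sketch of that idea, not the library's implementation. `EscapeJsonKeepUnicode` is a hypothetical name, and the function does not reproduce everything json_Encode does.

```vba
' Hypothetical standalone helper (not part of VBA-JSON).
' Escapes quotes, backslashes and control characters,
' and leaves non-ASCII characters untouched.
Public Function EscapeJsonKeepUnicode(ByVal value As String) As String
    Dim i As Long
    Dim ch As String
    Dim code As Long
    Dim result As String

    For i = 1 To Len(value)
        ch = Mid$(value, i, 1)
        code = AscW(ch)
        ' AscW returns a signed 16-bit value; map negatives into 0..65535.
        If code < 0 Then code = code + 65536

        ' The first matching Case wins, so the short escapes
        ' take priority over the generic 0 To 31 range.
        Select Case code
        Case 34: ch = "\"""   ' double quote
        Case 92: ch = "\\"    ' backslash
        Case 8: ch = "\b"
        Case 9: ch = "\t"
        Case 10: ch = "\n"
        Case 12: ch = "\f"
        Case 13: ch = "\r"
        Case 0 To 31
            ' Remaining control characters -> 4-digit hex escape
            ch = "\u" & Right$("0000" & Hex$(code), 4)
        End Select

        result = result & ch
    Next i

    EscapeJsonKeepUnicode = result
End Function
```

In the module itself, the equivalent change would presumably be narrowing the case above to `Case 0 To 31` rather than deleting it.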

@filetvignon

@nextcrom that solution suited me perfectly, thank you!

@timhall
Member

timhall commented Apr 6, 2018

Unfortunately, all of the fixes for UTF-8 support are Windows-only. It's a major issue that I'm looking into, but I may not have a good solution for a while.

@davidcie

davidcie commented Jun 6, 2019

Spent a good few minutes today debugging what was producing the strange escaped Unicode in our Access workflow. It would be great if you came across a way to solve this in a cross-platform manner. In the meantime I will look into the workaround @nextcrom kindly provided (thank you!).

@joyfullservice

I have added pull request #168 to add an option to preserve the Unicode text instead of escaping it. If accepted, this pull request should close this issue.

@JonhSilver

So, any changes? I still always have this problem.

@JonhSilver

There is also a problem where \r is always added to all text.

@breshman

breshman commented Nov 4, 2020

For accented characters I solved it this way: I commented out the lines highlighted in yellow, as @nextcrom indicated.

Before: \u00C1NCASH
After: ÁNCASH

image

@sabatale

sabatale commented Feb 7, 2021

The commit from @joyfullservice is not working, but @breshman is absolutely right!

@joyfullservice

> The commit from @joyfullservice is not working, but @breshman is absolutely right!

That probably depends on the type of output that you want. If you are writing to a file, some programs may need a UTF-8 BOM in the output file to properly render the extended characters.
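
For the file-output case, here is a rough sketch of controlling the BOM with ADODB.Stream (Windows-only). It assumes that a text stream with Charset "utf-8" emits a three-byte BOM at the start, which is skipped by copying from position 3 into a binary stream. `SaveUtf8` and its parameters are hypothetical names, not part of VBA-JSON.

```vba
' Hypothetical helper (not part of VBA-JSON); Windows-only (ADODB).
' Writes content as UTF-8, with or without a leading BOM.
Public Sub SaveUtf8(ByVal filePath As String, ByVal content As String, _
                    ByVal includeBom As Boolean)
    Const adTypeBinary As Long = 1
    Const adTypeText As Long = 2
    Const adSaveCreateOverWrite As Long = 2

    Dim textStream As Object
    Set textStream = CreateObject("ADODB.Stream")
    textStream.Type = adTypeText
    textStream.Charset = "utf-8"
    textStream.Open
    textStream.WriteText content

    If includeBom Then
        ' The text stream already starts with the BOM bytes EF BB BF.
        textStream.SaveToFile filePath, adSaveCreateOverWrite
    Else
        Dim binStream As Object
        Set binStream = CreateObject("ADODB.Stream")
        binStream.Type = adTypeBinary
        binStream.Open
        ' Skip the 3 BOM bytes, then copy the rest as raw bytes.
        textStream.Position = 3
        textStream.CopyTo binStream
        binStream.SaveToFile filePath, adSaveCreateOverWrite
        binStream.Close
    End If

    textStream.Close
End Sub
```

Which variant to use depends on the consumer: some programs need the BOM to detect UTF-8, while others treat the BOM bytes as unexpected leading content.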

@rcsilva81

Worked like a charm for Portuguese (BR). Thanks.
