UTF-8 characters encoded in 4 bytes not supported in filenames #598
Comments
@mmakassikis Do you have time to fix it? I guess we need to compare unicode.c in ksmbd with cifs_unicode.c. cifs.ko seems to use utf8_to_utf32 or utf8s_to_utf16s instead of ->char2uni, and char2uni doesn't fully support UTF-8.
@nvxos Can you check whether the problem is fixed with the following patch? (namjaejeon@f389804)
@namjaejeon |
@mmakassikis Thanks for your review. I updated the patch (namjaejeon@8dffdce). Let me know if you find any issues.
Using UTF-8 characters encoded in 4 bytes in filenames doesn't seem to be supported. On Windows, for example, it triggers a "file or folder does not exist" error. Linux clients show the same type of error.
The characters in question are, for example, some emojis like "🔥" (U+1F525, https://apps.timwhitlock.info/unicode/inspect/hex/1F525). Characters encoded in 3 bytes or fewer don't seem to pose a problem, for example the emoji "❤️" (https://apps.timwhitlock.info/unicode/inspect?s=%E2%9D%A4%EF%B8%8F), whose two code points (U+2764 and U+FE0F) each take 3 bytes.
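To illustrate why the 4-byte case is special, here is a minimal sketch in plain userspace C. It is not ksmbd or kernel code, and `decode_utf8` and `encode_utf16` are hypothetical helper names used only for illustration. A 4-byte UTF-8 sequence decodes to a code point above U+FFFF, which does not fit in a single 16-bit unit and has to be written as a UTF-16 surrogate pair, while a 3-byte sequence fits in one unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence from s (at most len bytes) into *cp.
 * Returns the number of bytes consumed, or -1 on malformed input.
 * Minimal illustration: it does not reject overlong encodings.
 */
static int decode_utf8(const unsigned char *s, size_t len, uint32_t *cp)
{
    int n, i;
    uint32_t c;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return -1;

    /* The lead byte tells us how long the sequence is. */
    if (s[0] < 0x80) {
        n = 1; c = s[0];
    } else if ((s[0] & 0xE0) == 0xC0) {
        n = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        n = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        n = 4; c = s[0] & 0x07;
    } else {
        return -1;
    }

    if ((size_t)n > len)
        return -1;

    /* Each continuation byte contributes its low 6 bits. */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;
        c = (c << 6) | (s[i] & 0x3F);
    }

    *cp = c;
    return n;
}

/*
 * Encode cp as UTF-16 into out.
 * Returns the number of 16-bit units written (1 or 2), or -1 if cp
 * is not a valid Unicode scalar value.
 */
static int encode_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Above U+FFFF: split the 20 remaining bits into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    return 2;
}
```

With these helpers, the bytes F0 9F 94 A5 ("🔥") decode to U+1F525 and need two UTF-16 units (D83D DD25), while E2 9D A4 (the first code point of "❤️") decodes to U+2764 and needs only one. A conversion path that produces a single 16-bit value per character cannot represent the first case.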
Some references I found on the subject, and other helpful links I used to pinpoint the issue:
https://en.wikipedia.org/wiki/Unicode#Code_planes_and_blocks
https://apps.timwhitlock.info/emoji/tables/unicode
https://apps.timwhitlock.info/unicode/inspect
For some context, this issue happened on a deployment of KSMBD on FreeboxOS, the OS of the Freebox router from the French ISP Free. After I filed a ticket on their bug tracker (https://dev.freebox.fr/bugs/task/38504), they asked me to open an issue here.