MultiByte Encoding Support #81
Comments
Could you explain the subject in a bit more detail? TIA
@dvv I just changed the title to MultiByte Encoding Support
Strings in Lua may contain any 8-bit value, including embedded zeros, which can be specified as '\0'. So we can use any multibyte character encoding, such as Shift_JIS or EUC-JIS, as well as UTF-8.
We would like to define APIs to satisfy these goals, so we need more than just a simple API like convert(src, dest_encoding) that returns dest. @dvv I hope this explanation is clear enough.
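To make the point above concrete, here is a minimal sketch of the same text held as raw bytes in three encodings. It is written in Python only because its standard codecs cover these encodings; the `bytes` type stands in for a Lua string, `euc_jp` is assumed as the EUC variant, and nothing here is a proposed API.

```python
# Illustration only: Python's bytes type plays the role of a Lua string,
# i.e. an arbitrary sequence of 8-bit values with no encoding attached.
text = "日本語"

utf8 = text.encode("utf-8")
sjis = text.encode("shift_jis")
eucjp = text.encode("euc_jp")  # assumed EUC variant for this sketch

# Same three characters, but different byte sequences and byte lengths.
lengths = {"utf-8": len(utf8), "shift_jis": len(sjis), "euc_jp": len(eucjp)}

# An embedded zero is just another byte in the sequence.
with_nul = b"a\0b"

print(lengths)
print(len(with_nul))
```

The byte length differs per encoding while the character count stays at three, which is one reason a single convert call does not cover everything: length, indexing and case functions all need to know the encoding as well.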
We should decide the rules for encodings of strings and cBuffers. This is just an idea, but my take is to use only UTF-8 for strings and allow any encoding for cBuffers. However, I have not thought it through yet, so I'm not sure this actually works. Maybe allowing any encoding for both strings and cBuffers is a better way.
Right. As previously stated, IMHO it's better to keep things small and clean as far as we can, so UTF-8 should be enough for starters.
Something like a Unix pipe would be best, for example shift_jis -> UTF-8 -> to_lower.
Avoid iconv because of license incompatibility.
Maybe we can use some parts of libnkf, bsdconv, or PHP's mbstring.
This is a very important subject, so let's take time for consideration.
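The pipe idea (shift_jis -> UTF-8 -> to_lower) could be sketched roughly as below. This is a Python illustration of the composition only, not a proposed API for this project: `transcode`, `to_lower` and `pipeline` are hypothetical names, and the conversions lean on Python's built-in codecs rather than on any of the libraries mentioned above.

```python
from functools import reduce
from typing import Callable

# A stage takes raw bytes and returns raw bytes, like one process in a pipe.
Stage = Callable[[bytes], bytes]

def transcode(src: str, dst: str) -> Stage:
    """Hypothetical stage: re-encode bytes from encoding src to encoding dst."""
    def stage(data: bytes) -> bytes:
        return data.decode(src).encode(dst)
    return stage

def to_lower(encoding: str = "utf-8") -> Stage:
    """Hypothetical stage: lowercase text that is known to be in `encoding`."""
    def stage(data: bytes) -> bytes:
        return data.decode(encoding).lower().encode(encoding)
    return stage

def pipeline(*stages: Stage) -> Stage:
    """Chain stages left to right, feeding each output into the next stage."""
    def run(data: bytes) -> bytes:
        return reduce(lambda acc, stage: stage(acc), stages, data)
    return run

# shift_jis -> UTF-8 -> to_lower
sjis_to_lower_utf8 = pipeline(transcode("shift_jis", "utf-8"), to_lower("utf-8"))

source = "HELLO 日本語".encode("shift_jis")
result = sjis_to_lower_utf8(source)
print(result.decode("utf-8"))
```

Each stage only needs to agree with its neighbours on the encoding of the bytes passed between them, so new steps can be added without touching the existing ones.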