Consider string normalisation #18
Probably the best way to implement this is by:
|
NFC string normalisation is probably best suited, as combining codepoints where possible would minimise file size and reduce the complexity of width calculations in UIs. (Even when you do the right thing and use unicode-width, it does a lookup per codepoint, so reducing the number of codepoints is an improvement. And if you do the wrong thing and assume every codepoint is one character, you are wrong by less with NFC normalisation.)
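To make the codepoint argument concrete, here is a minimal sketch in Python (chosen only for illustration, it is not this project's language) using the standard-library `unicodedata` module. It shows a decomposed "é" shrinking from two codepoints to one under NFC, along with the effect on the UTF-8 byte count.

```python
import unicodedata

# "é" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
decomposed = "e\u0301"

# NFC composes the pair into the single precomposed codepoint U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Number of codepoints before and after normalisation.
decomposed_len = len(decomposed)
composed_len = len(composed)

# UTF-8 size on disk before and after normalisation.
decomposed_bytes = len(decomposed.encode("utf-8"))
composed_bytes = len(composed.encode("utf-8"))

print(decomposed_len, composed_len)      # 2 1
print(decomposed_bytes, composed_bytes)  # 3 2
```

A naive "one codepoint is one column" width estimate is off by one for the decomposed form and correct for the composed form, which is the "wrong by less" case described above.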
As this could cause issues it should be possible to disable without recompilation, as a workaround. This requires handing around some manner of flag to all code locations where normalisation would occur, so that either all or no text is normalised. This becomes very tricky if we do normalisation in |
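One possible shape for such a runtime switch is sketched below in Python, purely as an illustration. The names `TextConfig` and `ingest` are hypothetical and do not come from this project; the idea is only that every entry point for text receives the same configuration value, so that either all text is normalised or none is.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class TextConfig:
    # Hypothetical runtime setting, e.g. filled in from a command-line flag.
    normalise: bool = True


def ingest(text: str, config: TextConfig) -> str:
    """Return text as it should be stored, honouring the runtime flag."""
    if config.normalise:
        return unicodedata.normalize("NFC", text)
    return text


# The same config object is handed to every place that accepts text.
enabled = TextConfig(normalise=True)
disabled = TextConfig(normalise=False)

stored_on = ingest("e\u0301", enabled)    # composed to U+00E9
stored_off = ingest("e\u0301", disabled)  # left exactly as typed
```

Routing all input through one function like this keeps the decision in a single place, at the cost of having to pass the configuration to every caller.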
As different representations of the same visible string should match in regex operations, it would probably be good to normalise both buffer contents and commands.
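The matching problem can be sketched as follows, again in Python and only as an illustration. The variable names are hypothetical; the buffer line holds a decomposed "é" while the user types the precomposed one in the pattern. Normalising a whole pattern string is a simplification here, and a real implementation would need to consider how normalisation interacts with regex syntax.

```python
import re
import unicodedata


def nfc(s: str) -> str:
    """Normalise a string to NFC."""
    return unicodedata.normalize("NFC", s)


# Buffer content using 'e' + U+0301 COMBINING ACUTE ACCENT.
buffer_line = "cafe\u0301 au lait"

# Pattern typed by the user with the precomposed U+00E9.
pattern = "caf\u00e9"

# Without normalisation the two spellings differ at the codepoint level.
raw_match = re.search(pattern, buffer_line)

# Normalising both sides makes the visually identical strings match.
norm_match = re.search(nfc(pattern), nfc(buffer_line))

print(raw_match is None)       # True
print(norm_match is not None)  # True
```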
Details can be found here: https://tonsky.me/blog/unicode/