Support Unicode encodings #9

Open
Serentty opened this issue Oct 17, 2019 · 6 comments

@Serentty

It would be convenient if Millfork allowed string literals for UTF-8 and UTF-16. This might seem like an odd feature to have, but I've been working on supporting rendering Unicode text on the Commander X16, especially CJK text. UTF-8 would also make sense for platforms such as the Altair 8800 where encoding is mostly a matter of agreement between the software running on the computer and the terminal being used to access it.
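To illustrate what such literals would have to emit, here is a minimal Python sketch (not Millfork syntax) that prints the byte layout of a short CJK string in both encodings. The sample text and the choice of little-endian UTF-16 without a byte-order mark are assumptions made for the example only.

```python
# Illustration only: this is Python, not Millfork, and the sample string
# and the little-endian UTF-16 variant are assumptions for the example.
text = "日本語"  # three CJK characters, all inside the BMP

utf8 = text.encode("utf-8")
utf16le = text.encode("utf-16-le")  # little-endian, no byte-order mark

# Each of these characters takes three bytes in UTF-8
# and one 16-bit code unit (two bytes) in UTF-16.
print(len(utf8), utf8.hex(" "))
print(len(utf16le), utf16le.hex(" "))
```

For text that is mostly CJK, UTF-16 is the more compact of the two, while UTF-8 stays byte-compatible with ASCII, which matters on terminal-driven machines.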

KarolS added a commit that referenced this issue Oct 17, 2019
@KarolS
Owner

KarolS commented Oct 17, 2019

Done. It will be available in the next release.

@Serentty
Author

Wow, that was fast! Thanks.

@Serentty
Author

I just saw that for UTF-8, it only allows characters within the BMP. What's the reasoning behind this?

@Serentty reopened this Oct 17, 2019
@KarolS
Owner

KarolS commented Oct 18, 2019

Honestly, the reason was me being too lazy to mess with the Char-oriented code too much, and Chars are limited to U+FFFF. But a proper fix wasn't even that hard, so I implemented it as well.
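For context, a 16-bit character type holds a code point above U+FFFF as two surrogate code units, so the pair has to be recombined before UTF-8 encoding. The Python sketch below shows that general technique. It is not the code from the commit, and the function names `combine_surrogates` and `encode_utf8` are hypothetical.

```python
# General-technique sketch, not Millfork's implementation.
# Function names are hypothetical, chosen for this example.

def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into one code point above U+FFFF."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_utf8(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (1 to 4 bytes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # Code points beyond the BMP need the four-byte form.
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# U+20000 is a CJK ideograph outside the BMP; in UTF-16 it is D840 DC00.
cp = combine_surrogates(0xD840, 0xDC00)
print(hex(cp), encode_utf8(cp).hex(" "))
```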

KarolS added a commit that referenced this issue Oct 18, 2019
@Serentty
Author

Yay! And sorry to keep bringing things up on the same issue, but I feel like it would be even more spammy to keep opening new ones. Would it be possible to add a compile-time flag to disable the printable ASCII limitation in code, so that such characters can be used in identifiers? Or is there a technical aspect in the parser that prevents that? I know that Unicode equivalency is a big can of worms, but frankly I would be satisfied if it just treated identifiers blindly without worrying about normalization or anything like that.

If you're sick of Unicode by now, I would be happy to make a PR for this if you would be willing to merge it.
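As a concrete picture of what "treating identifiers blindly" would mean, the Python sketch below compares two spellings of the same visible word. The example word is an assumption, and nothing here reflects how the Millfork parser actually handles identifiers.

```python
# Illustration only: the example word is an assumption, and this says
# nothing about how the Millfork parser handles identifiers.
import unicodedata

composed = "caf\u00e9"     # "é" as a single precomposed code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent

# A blind comparison sees two different identifiers...
same_bytes = composed.encode("utf-8") == decomposed.encode("utf-8")

# ...while a normalizing comparison would treat them as one.
same_after_nfc = (unicodedata.normalize("NFC", composed)
                  == unicodedata.normalize("NFC", decomposed))

print(same_bytes, same_after_nfc)
```

Under the blind approach, the two spellings would simply be distinct names, and keeping source files consistently normalized would be the programmer's responsibility.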

@KarolS
Owner

KarolS commented Oct 24, 2019

0.3.10 has been released and contains the UTF support for string literals.

I'll leave the identifiers on the table.
