Core: Support char8_t #98468

Contributor

@Repiteo Repiteo commented Oct 23, 2024

One of the additions to C++20 is the new type char8_t[1], which functions as the UTF-8 equivalent of char16_t and char32_t. Despite being a C++20 feature, this type can be safely backported to C++17 using -fchar8_t (/Zc:char8_t on MSVC). While char8_t is intended to replace char in UTF-8 contexts, doing so here would be a very wide-reaching change that calls for a more granular approach. Instead, this simply extends the character-type support introduced in the above two PRs to include char8_t.

Footnotes

  1. https://en.cppreference.com/w/cpp/language/types#char8_t
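
To make the backport idea concrete, here is a minimal sketch of how code can detect whether char8_t is available as a distinct type, using the standard feature-test macro `__cpp_char8_t`. The alias `utf8_char` and the helper `code_unit_length` are hypothetical names chosen for illustration; they are not taken from this PR or from the engine.

```cpp
#include <cstddef>
#include <type_traits>

// __cpp_char8_t is defined when the compiler treats char8_t as a built-in
// type: C++20 mode, or C++17 with -fchar8_t (GCC/Clang) or /Zc:char8_t (MSVC).
#if defined(__cpp_char8_t)
using utf8_char = char8_t; // hypothetical alias, for illustration only
#else
using utf8_char = char; // fallback: plain char carries the UTF-8 code units
#endif

// Counts the code units in a NUL-terminated string of any character type.
// Hypothetical helper, shown only to demonstrate character-type-generic code.
template <typename CharT>
constexpr std::size_t code_unit_length(const CharT *p_str) {
	std::size_t len = 0;
	while (p_str[len] != CharT(0)) {
		++len;
	}
	return len;
}

#if defined(__cpp_char8_t)
// char8_t is a distinct type, not an alias of char or unsigned char.
static_assert(!std::is_same_v<char8_t, char>, "char8_t is distinct from char");
static_assert(!std::is_same_v<char8_t, unsigned char>, "char8_t is distinct from unsigned char");
#endif
```

With or without the flag, `code_unit_length(u8"h\u00e9")` counts three code units, because U+00E9 takes two bytes in UTF-8; only the element type of the `u8` literal changes (const char without char8_t, const char8_t with it).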

@timothyqiu
Member

> Despite being a C++20 feature, this type can be safely backported to C++17 using -fchar8_t (/Zc:char8_t on MSVC).

Relying on compiler-specific extensions sounds hacky to me.

@lawnjelly
Member

lawnjelly commented Oct 24, 2024

What's the advantage of using char8_t over specific types uint8_t and int8_t?

https://stackoverflow.com/questions/57402464/is-c20-char8-t-the-same-as-our-old-char
https://www.think-cell.com/en/career/devblog/char8_t-was-a-bad-idea

EDIT:
It seems like char8_t is unsigned, but the bit depth may not be guaranteed.

Personally, I am not convinced by the use of non-fixed-width types in modern code. Programmers have been moving away from these types: most use strong types (e.g. s32, u32) these days, though some still use weak types for local variables.

The advantage of using strong types is that you write the code once, and it will run the same anywhere, with different compilers and on different platforms. The only things to watch for are alignment issues (e.g. bus errors on Android) and performance penalties for unaligned access.

With weak types you can easily write code that works on your machine but has hard-to-diagnose bugs on other platforms or compilers. Any serialized files using these types, and any packets sent over the internet, may also be incompatible.
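
The width and signedness points above can be checked at compile time. The sketch below is illustrative only: `bit_width_of` is a hypothetical helper, and the char8_t checks are guarded so that the snippet still compiles where the type is unavailable. Per the C++20 standard, char8_t has unsigned char as its underlying type, so it is unsigned and exactly CHAR_BIT bits wide, which is at least 8 but not guaranteed to be exactly 8.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Width of a type in bits on the current platform (hypothetical helper).
template <typename T>
constexpr std::size_t bit_width_of() {
	return sizeof(T) * CHAR_BIT;
}

// Fixed-width types have the same width on every platform that provides them.
static_assert(bit_width_of<std::uint8_t>() == 8, "uint8_t is exactly 8 bits");
static_assert(bit_width_of<std::int32_t>() == 32, "int32_t is exactly 32 bits");

#if defined(__cpp_char8_t)
// char8_t mirrors unsigned char in size, signedness and alignment.
static_assert(std::is_unsigned_v<char8_t>, "char8_t is unsigned");
static_assert(sizeof(char8_t) == sizeof(unsigned char), "same size as unsigned char");
static_assert(alignof(char8_t) == alignof(unsigned char), "same alignment as unsigned char");
// It is still a distinct type, so it never aliases uint8_t.
static_assert(!std::is_same_v<char8_t, std::uint8_t>, "char8_t is not uint8_t");
#endif
```

On mainstream platforms, where CHAR_BIT is 8, char8_t and uint8_t therefore have the same width; the practical difference is that char8_t is a separate type for overloading and template purposes.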

@Repiteo
Contributor Author

Repiteo commented Oct 24, 2024

Hmm, that's all very fair. Compounded with local testing showing that IntelliSense won't always play nice with char8_t on C++17, a backport implementation doesn't feel like the way to go.

I'll instead integrate barebones char8_t support into my C++20 PR[1], where it can be more sensibly wrapped. Closing.

Footnotes

  1. Core: Support c++20 compilation #89660

@Repiteo Repiteo closed this Oct 24, 2024
@Repiteo Repiteo deleted the core/char8_t branch October 24, 2024 14:09
@Repiteo Repiteo removed this from the 4.x milestone Oct 24, 2024
@Repiteo Repiteo removed request for a team October 24, 2024 14:09