Celestia runs on different platforms with different locales. While most desktop GNU/Linux installations prefer UTF-8 locales, some other systems, e.g. older Windows, prefer pure 8-bit encodings such as CP1251. Our core code, on the other hand, always assumes that all strings are UTF-8, so the win32 frontend has conversion functions scattered here and there.
C++20 introduced a new fundamental type, char8_t, for representing UTF-8 code units. It is required to be large enough to hold any UTF-8 code unit (8 bits) and has the same size, signedness, and alignment as unsigned char (and therefore the same size and alignment as char and signed char), but it is a distinct type. C++20 also added the corresponding std::basic_string<T> and std::basic_string_view<T> specializations: std::u8string and std::u8string_view.
So my proposal is to use these types in our core routines and to use std::string and std::string_view only in the frontends, i.e. where non-UTF-8 characters can occur. Of course, since we target C++17, we would need to implement the required types ourselves.
I think the usual convention pre-C++20 is to use std::basic_string<unsigned char, CustomTraitsType> for this purpose, which makes sense to me, then ensure we use the various workarounds detailed in P2513R4 Compatibility and Portability Fix.
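For reference, here is a minimal C++17 sketch of what such a type could look like. The names `Utf8Traits`, `Utf8String`, `Utf8StringView`, `toUtf8String` and `toByteString` are hypothetical and not part of Celestia; the traits class only covers the members `std::basic_string` needs and does not attempt UTF-8 validation.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>       // std::mbstate_t
#include <ios>          // std::streamoff, std::streampos
#include <string>
#include <string_view>

// Hypothetical traits class: treats each unsigned char as one UTF-8 code unit.
struct Utf8Traits
{
    using char_type  = unsigned char;
    using int_type   = unsigned int;
    using off_type   = std::streamoff;
    using pos_type   = std::streampos;
    using state_type = std::mbstate_t;

    static void assign(char_type& dst, const char_type& src) noexcept { dst = src; }
    static char_type* assign(char_type* dst, std::size_t n, char_type c)
    {
        if (n != 0) std::memset(dst, c, n);
        return dst;
    }
    static bool eq(char_type a, char_type b) noexcept { return a == b; }
    static bool lt(char_type a, char_type b) noexcept { return a < b; }
    static int compare(const char_type* a, const char_type* b, std::size_t n)
    {
        return n == 0 ? 0 : std::memcmp(a, b, n);
    }
    static std::size_t length(const char_type* s)
    {
        return std::strlen(reinterpret_cast<const char*>(s));
    }
    static const char_type* find(const char_type* s, std::size_t n, const char_type& c)
    {
        return n == 0 ? nullptr
                      : static_cast<const char_type*>(std::memchr(s, c, n));
    }
    static char_type* move(char_type* dst, const char_type* src, std::size_t n)
    {
        if (n != 0) std::memmove(dst, src, n);
        return dst;
    }
    static char_type* copy(char_type* dst, const char_type* src, std::size_t n)
    {
        if (n != 0) std::memcpy(dst, src, n);
        return dst;
    }
    static char_type to_char_type(int_type i) noexcept { return static_cast<char_type>(i); }
    static int_type to_int_type(char_type c) noexcept { return c; }
    static bool eq_int_type(int_type a, int_type b) noexcept { return a == b; }
    static int_type eof() noexcept { return static_cast<int_type>(-1); }
    static int_type not_eof(int_type i) noexcept { return i == eof() ? 0 : i; }
};

// Distinct from std::string, so mixing the two needs an explicit conversion.
using Utf8String     = std::basic_string<unsigned char, Utf8Traits>;
using Utf8StringView = std::basic_string_view<unsigned char, Utf8Traits>;

// Boundary helpers (hypothetical): the caller asserts the bytes are UTF-8.
inline Utf8String toUtf8String(std::string_view bytes)
{
    return Utf8String(reinterpret_cast<const unsigned char*>(bytes.data()), bytes.size());
}

inline std::string toByteString(Utf8StringView text)
{
    return std::string(reinterpret_cast<const char*>(text.data()), text.size());
}
```

Because `Utf8String` and `std::string` are unrelated types, passing a frontend string to a core routine without going through a conversion helper would fail to compile, which is the property the proposal is after.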
Is it too early to bump the required standard to C++20?
> Is it too early to bump the required standard to C++20?
Definitely. I suppose we could switch (if we need to) no earlier than 2025. Personally, I only want designated initializers, and maybe concepts, although they are sometimes too ugly (the infamous requires requires), and maybe modules, but gcc doesn't support them.
> I think the usual convention pre-C++20 is to use std::basic_string<unsigned char, CustomTraitsType> for this purpose, which makes sense to me, then ensure we use the various workarounds detailed in P2513R4 Compatibility and Portability Fix.
I hoped that such a std::basic_string<unsigned char, CustomTraitsType> would be compatible with char8_t, but it seems it is not. Maybe in C++26 they will fix all the issues, or make everything so bad that rewriting in Rust/Zig/Carbon becomes the best solution.
But anyway, it makes sense to evaluate std::basic_string<unsigned char, CustomTraitsType>, especially given that char is signed on some platforms and unsigned on others, which may lead to weird bugs.
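To illustrate the kind of signedness bug meant here, a small sketch follows. The function names `utf8SequenceLength` and `countCodePoints` are hypothetical, and the classification is deliberately simplified (it does not reject overlong or out-of-range lead bytes such as 0xC0 or 0xF5).

```cpp
#include <cstddef>
#include <string_view>

// Length in bytes of the UTF-8 sequence introduced by `lead`,
// or 0 if `lead` is a continuation byte or not a lead byte at all.
// Taking unsigned char makes the comparisons below platform-independent.
constexpr std::size_t utf8SequenceLength(unsigned char lead)
{
    if (lead < 0x80)           return 1; // ASCII
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

// Counts code points by skipping continuation bytes (10xxxxxx).
inline std::size_t countCodePoints(std::string_view utf8)
{
    std::size_t count = 0;
    for (char c : utf8)
    {
        // Pitfall: where char is signed, a test such as `c >= 0x80` is never
        // true, because bytes above 0x7F are negative. Convert to unsigned
        // char before inspecting the bits.
        const auto byte = static_cast<unsigned char>(c);
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

With an unsigned code unit type in the core, the cast inside the loop would not be needed, and the "forgot to cast" variant of this bug could not be written by accident.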
@ajtribick @levinli303 what do you think?