Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Padding in layout of std::string with allocator #718

Open
arichardson opened this issue Sep 11, 2023 · 4 comments
Open

Padding in layout of std::string with allocator #718

arichardson opened this issue Sep 11, 2023 · 4 comments
Labels
enhancement New feature or request libc++ question Further information is requested

Comments

@arichardson
Copy link
Member

Looking at libcxx/test/libcxx/strings/basic.string/sizeof.compile.pass.cpp, it turns out that using a std::string with a custom allocator and 32-bit size_type results in padding that cannot be reused by the allocator via __compressed_pair.

The fix would be to store the pointer member before the size_type members in __long, which is what the _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT does. However, we still use the v1 ABI so this is not enabled by default.

Should we enable _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT for purecap on the next breaking ABI change?

@arichardson arichardson added enhancement New feature or request question Further information is requested libc++ labels Sep 11, 2023
arichardson added a commit that referenced this issue Sep 11, 2023
The current layout is somewhat inefficient with an allocator:
#718
arichardson added a commit that referenced this issue Sep 12, 2023
The current layout is somewhat inefficient with an allocator:
#718
arichardson added a commit to CTSRD-CHERI/libcxx that referenced this issue Sep 12, 2023
The current layout is somewhat inefficient with an allocator:
CTSRD-CHERI/llvm-project#718
@bsdjhb
Copy link
Contributor

bsdjhb commented Sep 22, 2023

Are there any other similar knobs for C++'s ABI that we should also consider enabling for purecap? It's effectively a new architecture in LLVM and CheriBSD, so it doesn't (yet) need backwards ABI compat, so it would make sense for purecap arches to start with with the current ideal ABI for libc++.

@arichardson
Copy link
Member Author

We could set _LIBCPP_ABI_VERSION=2. However, that is the current unstable ABI which means future libc++ updates could break the ABI. Hopefully they decide to fix version 2 at some point soon...

@arichardson
Copy link
Member Author

Here is the current list of ABI changes in v2:

// Change short string representation so that string data starts at offset 0,
// improving its alignment in some cases.
#    define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
// Fix deque iterator type in order to support incomplete types.
#    define _LIBCPP_ABI_INCOMPLETE_TYPES_IN_DEQUE
// Fix undefined behavior in how std::list stores its linked nodes.
#    define _LIBCPP_ABI_LIST_REMOVE_NODE_POINTER_UB
// Fix undefined behavior in  how __tree stores its end and parent nodes.
#    define _LIBCPP_ABI_TREE_REMOVE_NODE_POINTER_UB
// Fix undefined behavior in how __hash_table stores its pointer types.
#    define _LIBCPP_ABI_FIX_UNORDERED_NODE_POINTER_UB
#    define _LIBCPP_ABI_FORWARD_LIST_REMOVE_NODE_POINTER_UB
#    define _LIBCPP_ABI_FIX_UNORDERED_CONTAINER_SIZE_TYPE
// Define a key function for `bad_function_call` in the library, to centralize
// its vtable and typeinfo to libc++ rather than having all other libraries
// using that class define their own copies.
#    define _LIBCPP_ABI_BAD_FUNCTION_CALL_KEY_FUNCTION
// Override the default return value of exception::what() for
// bad_function_call::what() with a string that is specific to
// bad_function_call (see http://wg21.link/LWG2233). This is an ABI break
// because it changes the vtable layout of bad_function_call.
#    define _LIBCPP_ABI_BAD_FUNCTION_CALL_GOOD_WHAT_MESSAGE
// Enable optimized version of __do_get_(un)signed which avoids redundant copies.
#    define _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
// Give reverse_iterator<T> one data member of type T, not two.
// Also, in C++17 and later, don't derive iterator types from std::iterator.
#    define _LIBCPP_ABI_NO_ITERATOR_BASES
// Use the smallest possible integer type to represent the index of the variant.
// Previously libc++ used "unsigned int" exclusively.
#    define _LIBCPP_ABI_VARIANT_INDEX_TYPE_OPTIMIZATION
// Unstable attempt to provide a more optimized std::function
#    define _LIBCPP_ABI_OPTIMIZED_FUNCTION
// All the regex constants must be distinct and nonzero.
#    define _LIBCPP_ABI_REGEX_CONSTANTS_NONZERO
// Re-worked external template instantiations for std::string with a focus on
// performance and fast-path inlining.
#    define _LIBCPP_ABI_STRING_OPTIMIZED_EXTERNAL_INSTANTIATION
// Enable clang::trivial_abi on std::unique_ptr.
#    define _LIBCPP_ABI_ENABLE_UNIQUE_PTR_TRIVIAL_ABI
// Enable clang::trivial_abi on std::shared_ptr and std::weak_ptr
#    define _LIBCPP_ABI_ENABLE_SHARED_PTR_TRIVIAL_ABI
// std::random_device holds some state when it uses an implementation that gets
// entropy from a file (see _LIBCPP_USING_DEV_RANDOM). When switching from this
// implementation to another one on a platform that has already shipped
// std::random_device, one needs to retain the same object layout to remain ABI
// compatible. This switch removes these workarounds for platforms that don't care
// about ABI compatibility.
#    define _LIBCPP_ABI_NO_RANDOM_DEVICE_COMPATIBILITY_LAYOUT
// Don't export the legacy __basic_string_common class and its methods from the built library.
#    define _LIBCPP_ABI_DO_NOT_EXPORT_BASIC_STRING_COMMON
// Don't export the legacy __vector_base_common class and its methods from the built library.
#    define _LIBCPP_ABI_DO_NOT_EXPORT_VECTOR_BASE_COMMON
// According to the Standard, `bitset::operator[] const` returns bool
#    define _LIBCPP_ABI_BITSET_VECTOR_BOOL_CONST_SUBSCRIPT_RETURN_BOOL
// Remove the base 10 implementation of std::to_chars from the dylib.
// The implementation moved to the header, but we still export the symbols from
// the dylib for backwards compatibility.
#    define _LIBCPP_ABI_DO_NOT_EXPORT_TO_CHARS_BASE_10

@bsdjhb
Copy link
Contributor

bsdjhb commented Sep 23, 2023

If we do that, do all the headers get installed to /usr/include/c++/v2? If so, that might be a bit more invasive of a change to deal with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request libc++ question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants