Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thunks: Analyze data layout to detect platform differences #3156

Merged

Conversation

neobrain
Copy link
Member

@neobrain neobrain commented Sep 25, 2023

Overview

FEX's thunking currently assumes that parameters passed across architecture boundaries use the same data layout on both sides. For example, a function of the form void CreateObject(ConfigType* config) can only be handled by FEX if ConfigType has the same members (with the same bit sizes) on both x86 and on ARM. This is generally a safe assumption when thunking 64-bit guest applications, but it fully falls apart for 32-bit guests (since pointer members of any structure will be 32-bits instead of 64-bit). However even on 64-bit, it's desirable to have some assurance that we're indeed operating on consistent data.

This PR adds a data layout analysis framework as the first step towards solving these problems. The framework is integrated as a preprocessing pass in the thunk generator. This new pass performs these tasks:

  • it gathers all argument types used in any thunked functions
  • it queries their data layout from libclang on both x86 and ARM to build an internal type name -> data layout map for each architecture
  • finally, it compares the two maps to detect any potential differences

For simple libraries, the data layout analysis will detect a perfect match such that no additional work is needed. For more complex libraries, future PRs will add annotations to the interface files that allow to resolve data type differences semi-automatically via "struct repacking". Since most of our current thunk libraries fall in the latter category, analysis is only enabled for testing.

Implementation details

Since data layout is queried from libclang, we must run a large chunk of logic twice to get accurate data for both the guest and the host architecture. The thunk generator will first parse the thunk interface file and query data layouts for the guest architecture, after which it repeats the same for the host architecture.

It's easy to see that this gets very intertwined quickly, especially if we want a testable design.

To facilitate separation of concerns, a class-hierarchy approach is used for the implementation hence. #3067 already initiated this by splitting interface parsing and boilerplate code emission into an AnalysisAction and a GenerateThunkLibsAction, respectively. The new functionality affects AnalysisAction and adds a number of new modules on top, illustrated in this diagram:

classDiagram
    AnalysisAction
    AnalysisAction <|-- AnalyzeDataLayoutAction
    AnalysisAction <|-- DataLayoutCompareAction
    DataLayoutCompareAction <|-- DataLayoutCompareActionForTest
    DataLayoutCompareAction <|-- GenerateThunkLibsAction
  
    class AnalysisAction {
        - Parses interface files
        - Analyzes data layout factoring in annotations
    }
    class AnalyzeDataLayoutAction {
        - Runs AnalysisAction for the guest

        - Turns any libclang objects into strings so they
        can be used across sessions
    }
    class DataLayoutCompareAction {
        - Runs AnalysisAction for the host
        - Takes the output of an AnalyzeDataLayoutAction
        for the guest arch and detects any differences
        between the guest/host results
    }
    class DataLayoutCompareActionForTest {
        - Thin layer around DataLayoutCompareAction
        used for unit-testing data layout comparison.
        - Mainly deals with lifetime properties
        of libclang objects
    }
    class GenerateThunkLibsAction {
        - Emits thunk boilerplate, factoring in results
        of the DataLayoutCompareAction
    }
Loading

Impact

This feature is currently only enabled for unit tests. To enable it for real libraries, new annotations must be added and applied to the corresponding interface files. This has been done in the preview change set in #3135, which supports all of our existing thunk libraries.

These annotations are planned to be added on a per-type basis, so that we won't need to go through each function individually. That helps cut down the work involved tremendously.

Copy link
Member Author

@neobrain neobrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some notes for personal reference.

unittests/ThunkLibs/common.h Outdated Show resolved Hide resolved
ThunkLibs/include/common/GeneratorInterface.h Outdated Show resolved Hide resolved
Comment on lines +55 to +57
if (BITNESS EQUAL 32)
target_compile_definitions(${NAME}-guest-deps INTERFACE IS_32BIT_THUNK)
endif ()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes related to 32-bit support should probably be moved to a future PR, too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is needed for libfex_thunk_test, which we're already testing the 32-bit build of. Not including this here would require adding some temporary workaround that would be removed one PR later anyway.

ThunkLibs/Generator/data_layout.h Outdated Show resolved Hide resolved
ThunkLibs/Generator/data_layout.h Outdated Show resolved Hide resolved
@neobrain neobrain force-pushed the feature_thunk_data_layout_analysis branch from 79c3e22 to cf2b571 Compare September 26, 2023 09:19
@neobrain neobrain force-pushed the feature_thunk_data_layout_analysis branch from cf2b571 to 508cdf1 Compare September 26, 2023 15:20
/**
* Run the given ToolAction on the input code.
*
* The "silent" parameter is used to suppress non-fatal diagnostics in tests that expect failure
*/
inline void run_tool(clang::tooling::ToolAction& action, std::string_view code, bool silent = false) {
inline void run_tool(clang::tooling::ToolAction& action, std::string_view code, bool silent = false, std::optional<GuestABI> guest_abi = std::nullopt) {
Copy link
Member Author

@neobrain neobrain Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libclang will leak memory in invocation.run() when the given ToolAction throws an exception. This is annoying but otherwise harmless. To allow AddressSanitizer to run cleanly anyway, we may consider defining something like this here though to disable memory leak checking for thunk unit tests:

const char* __asan_default_options() { return "detect_leaks=0"; }

@neobrain neobrain force-pushed the feature_thunk_data_layout_analysis branch from 508cdf1 to 56ee4c1 Compare September 28, 2023 20:41
Copy link
Member

@Sonicadvance1 Sonicadvance1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm liking what I'm seeing. The unit tests are great to see!

unittests/ThunkLibs/abi.cpp Outdated Show resolved Hide resolved
unittests/ThunkLibs/abi.cpp Outdated Show resolved Hide resolved
unittests/ThunkLibs/abi.cpp Outdated Show resolved Hide resolved
CHECK(action->GetTypeCompatibility("struct A") == TypeCompatibility::None);
}

if (false) // TODO: Currently fails
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curly braces around for if or fix todo I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if the guest layout misses any members from the host layout would be a bit annoying, but I found an elegant way to implement the tested behavior:

The analysis pass already verified that the host layout doesn't missed any members from the guest layout (i.e. the reverse case). To extend this to a "both-way match", we can simply add a check for equal member count!

@neobrain neobrain force-pushed the feature_thunk_data_layout_analysis branch from 878d6b8 to 899d6a5 Compare October 2, 2023 20:02
@neobrain neobrain force-pushed the feature_thunk_data_layout_analysis branch from 899d6a5 to 36fedfd Compare October 2, 2023 20:03
@neobrain
Copy link
Member Author

neobrain commented Oct 2, 2023

Branch updated (see changes):

  • Minor cleanups and implemented all easy TODOs (others refer to open questions or will be addressed in follow-up PRs)
  • Now disabling ASAN leak detection for tests
  • Addressed review feedback
  • Fixed and re-enabled tests previously marked as failing

Good to merge from my end 🙂

The set of these types is tracked in AnalysisAction, to which extensive
verification logic is added to detect potential incompatibilities and to
enforce use of annotatations where needed.
This adds a ComputeDataLayout function that maps a set of clang::Types
to an internal representation of their data layout (size, member list, ...).
This runs the data layout analysis pass added in the previous change twice:
Once for the host architecture and once for the guest architecture. This
allows the new DataLayoutCompareAction to query architecture differences for
each type, which can then be used to instruct code generation accordingly.

Currently, type compatibility is classified into 3 categories:
* Fully compatible (same size/alignment for the type itself and any members)
* Repackable (incompatibility can be resolved with emission of automatable
  repacking code, e.g. when struct members are located at differing offsets
  due to padding bytes)
* Incompatible
@neobrain neobrain force-pushed the feature_thunk_data_layout_analysis branch from 36fedfd to fe681ab Compare October 2, 2023 20:19
Copy link
Member

@Sonicadvance1 Sonicadvance1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zip zap zoop, in to main it goes

@Sonicadvance1 Sonicadvance1 merged commit 48fa4f1 into FEX-Emu:main Oct 4, 2023
9 checks passed
@neobrain neobrain deleted the feature_thunk_data_layout_analysis branch October 4, 2023 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants