Fix targets registration on Windows #2866

Merged
merged 35 commits into from
Jun 6, 2024

Collaborator

@apwojcik apwojcik commented Mar 7, 2024

Unfortunately, automatic target registration does not work on Windows.
The linker does not add a specified DLL as a dependency unless at least one symbol from that dynamic library is used. Therefore, linking to migraphx_all_targets does not link the target dynamic libraries on Windows as intended.

This PR replaces automatic registration with explicit registration at runtime. When migraphx::make_target() loads a target library, it calls the register_target() function exported from that library, and that call performs the registration.

@apwojcik apwojcik added the Windows Related changes for Windows Environments label Mar 7, 2024
@apwojcik apwojcik requested a review from causten as a code owner March 7, 2024 15:20
@pfultz2
Collaborator

pfultz2 commented Mar 7, 2024

Can we just use the -Wl,--no-as-needed flag instead on windows?

@pfultz2
Collaborator

pfultz2 commented Mar 7, 2024

Also, we could make migraphx_all_targets a static library and then have it load all the targets before main:

__attribute__((constructor)) void load_targets()
{
    make_target("ref");
#ifdef HAVE_CPU
    make_target("cpu");
#endif
#ifdef HAVE_GPU
    make_target("gpu");
#endif
#ifdef HAVE_FPGA
    make_target("fpga");
#endif
}

@apwojcik
Collaborator Author

apwojcik commented Mar 7, 2024

Can we just use the -Wl,--no-as-needed flag instead on windows?

It is not supported on Windows. It is only applicable to shared objects.

@apwojcik
Collaborator Author

apwojcik commented Mar 7, 2024

Also, we could make migraphx_all_targets a static library and then have it load all the targets before main:

__attribute__((constructor)) void load_targets()
{
    make_target("ref");
#ifdef HAVE_CPU
    make_target("cpu");
#endif
#ifdef HAVE_GPU
    make_target("gpu");
#endif
#ifdef HAVE_FPGA
    make_target("fpga");
#endif
}

This will not work on Windows. DLLs are not the same as shared objects. Furthermore, MSVC and Clang on Windows do not recognize the constructor attribute because Windows does not support it. One must implement DllMain() to serve a similar purpose. However, that is more complicated because DllMain() is invoked for both process attach and thread attach. The automatic registration of a target creates thread-local storage, so the registration becomes unavailable once the thread that loaded the library exits. A DLL on Windows should instead create an execution context and return it to the caller, which stores the per-process state.

@apwojcik
Collaborator Author

apwojcik commented Mar 7, 2024

I managed to simplify the PR and limit the number of required changes.


codecov bot commented Mar 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.97%. Comparing base (354bc02) to head (165d0ec).
Report is 141 commits behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #2866   +/-   ##
========================================
  Coverage    91.97%   91.97%           
========================================
  Files          489      489           
  Lines        19390    19390           
========================================
  Hits         17833    17833           
  Misses        1557     1557           


@migraphx-bot
Collaborator

migraphx-bot commented Mar 8, 2024

| Test | Batch | Rate new (165d0e) | Rate old (206e0f) | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 1,741.50 | 1,750.99 | -0.54% | |
| torchvision-resnet50_fp16 | 64 | 4,065.27 | 4,085.88 | -0.50% | |
| torchvision-densenet121 | 32 | 1,458.17 | 1,465.87 | -0.52% | |
| torchvision-densenet121_fp16 | 32 | 2,516.42 | 2,524.65 | -0.33% | |
| torchvision-inceptionv3 | 32 | 885.24 | 890.07 | -0.54% | |
| torchvision-inceptionv3_fp16 | 32 | 1,477.42 | 1,483.80 | -0.43% | |
| cadene-inceptionv4 | 16 | 410.35 | 412.53 | -0.53% | |
| cadene-resnext64x4 | 16 | 417.51 | 419.70 | -0.52% | |
| slim-mobilenet | 64 | 3,986.60 | 4,007.04 | -0.51% | |
| slim-nasnetalarge | 64 | 100.54 | 101.00 | -0.46% | |
| slim-resnet50v2 | 64 | 1,671.94 | 1,681.03 | -0.54% | |
| bert-mrpc-onnx | 8 | 612.66 | 615.98 | -0.54% | |
| bert-mrpc-tf | 1 | 276.92 | 279.26 | -0.84% | |
| pytorch-examples-wlang-gru | 1 | 321.22 | 324.15 | -0.90% | |
| pytorch-examples-wlang-lstm | 1 | 290.98 | 327.92 | -11.26% | 🔴 |
| torchvision-resnet50_1 | 1 | 467.81 | 469.45 | -0.35% | |
| cadene-dpn92_1 | 1 | 246.86 | 247.24 | -0.15% | |
| cadene-resnext101_1 | 1 | 203.25 | 203.51 | -0.13% | |
| onnx-taau-downsample | 1 | 205.51 | 206.27 | -0.37% | |
| dlrm-criteoterabyte | 1 | 22.82 | 22.91 | -0.39% | |
| dlrm-criteoterabyte_fp16 | 1 | 42.56 | 42.74 | -0.43% | |
| agentmodel | 1 | 6,340.55 | 6,392.31 | -0.81% | |
| unet_fp16 | 2 | 34.04 | 34.21 | -0.49% | |
| resnet50v1_fp16 | 1 | 585.28 | 597.26 | -2.01% | |
| resnet50v1_int8 | 1 | 577.83 | 572.90 | 0.86% | |
| bert_base_cased_fp16 | 64 | 642.42 | 646.37 | -0.61% | |
| bert_large_uncased_fp16 | 32 | 197.82 | 199.03 | -0.61% | |
| bert_large_fp16 | 1 | 116.83 | 117.51 | -0.58% | |
| distilgpt2_fp16 | 16 | 1,204.72 | 1,211.72 | -0.58% | |
| yolov5s | 1 | 300.61 | 302.25 | -0.54% | |
| tinyllama | 1 | 23.21 | 23.34 | -0.55% | |
| vicuna-fastchat | 1 | 133.45 | 134.18 | -0.54% | |
| whisper-tiny-encoder | 1 | 243.16 | 244.17 | -0.41% | |
| whisper-tiny-decoder | 1 | 255.26 | 256.68 | -0.55% | |

This build is not recommended to merge 🔴

@migraphx-bot
Collaborator

migraphx-bot commented Mar 8, 2024


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large: PASSED: MIGraphX meets tolerance
✅ yolov5s: PASSED: MIGraphX meets tolerance
✅ tinyllama: PASSED: MIGraphX meets tolerance
✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@pfultz2
Collaborator

pfultz2 commented Mar 8, 2024

Furthermore, MSVC/Clang on Windows does not recognize the constructor attribute because Windows does not support it.

Sure, we can write it using standard C++ as well:

struct auto_load_targets
{
    auto_load_targets()
    {
        make_target("ref");
#ifdef HAVE_CPU
        make_target("cpu");
#endif
#ifdef HAVE_GPU
        make_target("gpu");
#endif
#ifdef HAVE_FPGA
        make_target("fpga");
#endif
    }
};

static auto load_targets = auto_load_targets();

And this does work on windows because that is how we auto register unit tests.

One must implement DllMain() to serve a similar purpose.

Why would we call DllMain? This is not run from a dll. This is a static library being linked into an executable.

@apwojcik
Collaborator Author

apwojcik commented May 8, 2024

@pfultz2 it is done

}
};
[[maybe_unused]] static auto load_targets = auto_load_targets{};
} // namespace
Collaborator

This needs to go into a static/object library that we only link in for the tests. As written, it prevents lazy loading of the targets.

@@ -44,8 +66,6 @@ std::unordered_map<std::string, target>& target_map()
return m;
}

void register_target_init() { (void)target_map(); }
Collaborator

Don't remove this. I think this was necessary to ensure targets are loaded in some cases.

@@ -66,9 +65,6 @@ struct register_target_action
}
};

template <class T>
using auto_register_target = auto_register<register_target_action, T>;
Collaborator

Don't remove this. We still want to provide the option to register a target with CRTP.

@apwojcik apwojcik added the UAI label May 27, 2024
@apwojcik apwojcik requested a review from pfultz2 May 27, 2024 21:41
@apwojcik
Collaborator Author

@causten CodeCov reports a 0.01% test coverage decrease. However, when I look into the pull request details, it says that the patch is 100% covered. Am I reading it wrong?

@@ -33,7 +33,11 @@ if(MIGRAPHX_DISABLE_LARGE_BUFFER_TESTS)
add_compile_definitions(MIGRAPHX_DISABLE_LARGE_BUFFER_TESTS)
endif()

add_library(register_targets STATIC register_target.cpp)
target_link_libraries(register_targets PRIVATE migraphx)
Collaborator

I think you need to link migraphx_all_targets here as well to get the macro definitions such as HAVE_GPU, HAVE_CPU, etc.

Collaborator Author

Done

@@ -86,16 +84,5 @@ target make_target(const std::string& name)
return it->second;
}

std::vector<std::string> get_targets()
Collaborator

This function should not be removed.

Collaborator Author

Something is not right; I reverted that change. I will double-check.

Collaborator Author

I fixed it.

@causten causten merged commit 102a0a4 into develop Jun 6, 2024
43 checks passed
@causten causten deleted the register_target branch June 6, 2024 18:11
causten pushed a commit that referenced this pull request Jun 26, 2024
lajagapp pushed a commit to lajagapp/AMDMIGraphX that referenced this pull request Jul 8, 2024