Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flag to accept non-uniform WG sizes #2167

Merged
merged 24 commits into from
Oct 19, 2023
Merged

Add flag to accept non-uniform WG sizes #2167

merged 24 commits into from
Oct 19, 2023

Conversation

umangyadav
Copy link
Member

@umangyadav umangyadav commented Sep 11, 2023

fixes #1986
Depends on #2168
Relevant issues:
ROCm/MIOpen#2307

ROCm/composable_kernel#838

@umangyadav umangyadav requested a review from pfultz2 September 11, 2023 21:24
@umangyadav umangyadav self-assigned this Sep 11, 2023
@migraphx-bot
Copy link
Collaborator

migraphx-bot commented Sep 11, 2023

Test Batch Rate new
188cf9
Rate old
e8e142
Diff Compare
torchvision-resnet50 64 2,849.69 2,850.87 -0.04%
torchvision-resnet50_fp16 64 6,475.69 6,483.76 -0.12%
torchvision-densenet121 32 2,101.46 2,107.54 -0.29%
torchvision-densenet121_fp16 32 3,673.82 3,673.03 0.02%
torchvision-inceptionv3 32 1,592.32 1,594.82 -0.16%
torchvision-inceptionv3_fp16 32 2,585.25 2,592.86 -0.29%
cadene-inceptionv4 16 706.98 707.31 -0.05%
cadene-resnext64x4 16 697.85 697.36 0.07%
slim-mobilenet 64 8,354.93 8,344.55 0.12%
slim-nasnetalarge 64 226.73 226.95 -0.10%
slim-resnet50v2 64 2,678.81 2,676.49 0.09%
bert-mrpc-onnx 8 824.56 824.50 0.01%
bert-mrpc-tf 1 387.73 388.61 -0.23%
pytorch-examples-wlang-gru 1 298.53 299.61 -0.36%
pytorch-examples-wlang-lstm 1 316.09 316.43 -0.11%
torchvision-resnet50_1 1 600.12 597.52 0.44%
torchvision-inceptionv3_1 1 334.50 340.06 -1.63%
cadene-dpn92_1 1 395.91 399.86 -0.99%
cadene-resnext101_1 1 325.36 329.54 -1.27%
slim-vgg16_1 1 463.71 463.53 0.04%
slim-mobilenet_1 1 2,069.61 2,046.07 1.15%
slim-inceptionv4_1 1 214.57 214.62 -0.02%
onnx-taau-downsample 1 306.09 307.04 -0.31%
dlrm-criteoterabyte 1 21.70 21.71 -0.05%
dlrm-criteoterabyte_fp16 1 40.74 40.69 0.13%
agentmodel 1 5,782.02 5,783.52 -0.03%
unet_fp16 2 55.95 55.96 -0.01%
resnet50v1_fp16 1 943.66 948.45 -0.50%
bert_base_cased_fp16 64 970.15 970.32 -0.02%
bert_large_uncased_fp16 32 304.83 304.83 0.00%
bert_large_fp16 1 166.78 166.60 0.11%
distilgpt2_fp16 16 1,278.65 1,279.20 -0.04%

This build is OK for merge ✅

@migraphx-bot
Copy link
Collaborator

migraphx-bot commented Sep 11, 2023


    :white_check_mark:bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

    :white_check_mark:bert-mrpc-tf: PASSED: MIGraphX meets tolerance

    :white_check_mark:pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

    :white_check_mark:pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

    :white_check_mark:torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

🔴torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:cadene-dpn92_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:cadene-resnext101_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:slim-vgg16_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:slim-mobilenet_1: PASSED: MIGraphX meets tolerance

🔴slim-inceptionv4_1: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

    :white_check_mark:agentmodel: PASSED: MIGraphX meets tolerance

    :white_check_mark:unet: PASSED: MIGraphX meets tolerance

    :white_check_mark:resnet50v1: PASSED: MIGraphX meets tolerance

🔴bert_base_cased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:bert_large: PASSED: MIGraphX meets tolerance

🔴distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

@codecov
Copy link

codecov bot commented Sep 12, 2023

Codecov Report

Merging #2167 (7803850) into develop (c399062) will not change coverage.
Report is 1 commits behind head on develop.
The diff coverage is n/a.

❗ Current head 7803850 differs from pull request most recent head 188cf94. Consider uploading reports for the commit 188cf94 to get more accurate results

@@           Coverage Diff            @@
##           develop    #2167   +/-   ##
========================================
  Coverage    91.36%   91.36%           
========================================
  Files          439      439           
  Lines        16493    16493           
========================================
  Hits         15069    15069           
  Misses        1424     1424           

if(options.global % options.local != 0 and hip_accept_non_uniform_wg())
options.params += " -fno-offload-uniform-block";
else
assert(options.global % options.local == 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw an assert on older rocm that doesnt support the flag. If the flag is not supported than most likely non-uniform blocks is supported by default. I think we can remove this assert.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if flag is not supported then it would have calculated the global WG multiple of local WG. It would work with older rocms as well.
#1976

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the flag is not supported than most likely non-uniform blocks is supported by default

That isn't the case with 5.7. Neither the Flag nor non-uniform blocks are supported.

@umangyadav umangyadav requested a review from pfultz2 October 16, 2023 23:34
@umangyadav umangyadav requested a review from pfultz2 October 17, 2023 20:17
@pfultz2
Copy link
Collaborator

pfultz2 commented Oct 17, 2023

@umangyadav Do we have a verify test that uses the non-uniform blocks?

@umangyadav
Copy link
Member Author

@umangyadav Do we have a verify test that uses the non-uniform blocks?

yes test_slice2 and roialign tests.

@TedThemistokleous TedThemistokleous added bugfix Fixes a bug found in the code. enhancement New feature or request labels Oct 18, 2023
@causten causten merged commit 581b1b5 into develop Oct 19, 2023
15 checks passed
@causten causten deleted the add_fno_uniform branch October 19, 2023 14:25
@atamazov
Copy link

atamazov commented Mar 1, 2024

@umangyadav Is seems like you have some code that checks if compiler supports some option at runtime. FYI this is how we use check_cxx_compiler_flag() to check the same at build time: ROCm/MIOpen#2741

@umangyadav
Copy link
Member Author

we want to check at runtime that's why it is done that way. OTherwise we can also use check_cxx_compiler_flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Fixes a bug found in the code. enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add -fno-hip-uniform-block flag for the compilation when necessary
6 participants