
Fuse inputs with mlir #3010

Merged
merged 35 commits into develop on Jul 16, 2024

Conversation

pfultz2
Collaborator

@pfultz2 pfultz2 commented Apr 26, 2024

This will fuse the inputs, but only when the MIGRAPHX_ENABLE_MLIR_INPUT_FUSION environment variable is set.
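Gating an experimental pass behind an environment variable usually comes down to a truthiness check on the variable's value. A minimal, self-contained sketch of such a gate, assuming the flag is read with `std::getenv` (MIGraphX's real implementation goes through its own env-var helpers, and `mlir_input_fusion_enabled` is a hypothetical name used only for illustration):

```cpp
#include <cstdlib>

// Sketch of an env-var feature gate: the pass is enabled only when
// MIGRAPHX_ENABLE_MLIR_INPUT_FUSION is set to a non-empty, non-"0" value.
// This is a simplified stand-in, not MIGraphX's actual flag machinery.
bool mlir_input_fusion_enabled()
{
    const char* v = std::getenv("MIGRAPHX_ENABLE_MLIR_INPUT_FUSION");
    return v != nullptr and *v != '\0' and *v != '0';
}
```

With a gate like this, the fusion stays off by default and CI can opt in by exporting the variable in the job environment.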

@pfultz2 pfultz2 requested a review from causten as a code owner April 26, 2024 19:33
@pfultz2 pfultz2 requested review from manupak, umangyadav, CharlieL7 and TedThemistokleous and removed request for causten, manupak, umangyadav and CharlieL7 April 26, 2024 19:38

codecov bot commented Apr 26, 2024

Codecov Report

Attention: Patch coverage is 98.36066% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.21%. Comparing base (b4c29f0) to head (dd7985f).
Report is 161 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/module.cpp | 96.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3010      +/-   ##
===========================================
- Coverage    92.21%   92.21%   -0.01%     
===========================================
  Files          493      493              
  Lines        19725    19730       +5     
===========================================
+ Hits         18190    18194       +4     
- Misses        1535     1536       +1     


@migraphx-bot
Collaborator

migraphx-bot commented Apr 26, 2024

| Test | Batch | Rate new (dd7985) | Rate old (9bee6a) | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 1,750.65 | 1,741.07 | 0.55% | |
| torchvision-resnet50_fp16 | 64 | 4,182.05 | 4,163.68 | 0.44% | |
| torchvision-densenet121 | 32 | 1,469.41 | 1,460.87 | 0.59% | |
| torchvision-densenet121_fp16 | 32 | 2,552.38 | 2,543.03 | 0.37% | |
| torchvision-inceptionv3 | 32 | 889.12 | 885.97 | 0.35% | |
| torchvision-inceptionv3_fp16 | 32 | 1,492.34 | 1,487.78 | 0.31% | |
| cadene-inceptionv4 | 16 | 412.01 | 410.41 | 0.39% | |
| cadene-resnext64x4 | 16 | 419.47 | 417.50 | 0.47% | |
| slim-mobilenet | 64 | 4,013.92 | 3,995.38 | 0.46% | |
| slim-nasnetalarge | 64 | 100.99 | 100.60 | 0.38% | |
| slim-resnet50v2 | 64 | 1,680.21 | 1,672.90 | 0.44% | |
| bert-mrpc-onnx | 8 | 616.89 | 612.15 | 0.77% | |
| bert-mrpc-tf | 1 | 277.80 | 277.43 | 0.13% | |
| pytorch-examples-wlang-gru | 1 | 322.05 | 367.65 | -12.40% | 🔴 |
| pytorch-examples-wlang-lstm | 1 | 291.17 | 295.92 | -1.61% | |
| torchvision-resnet50_1 | 1 | 471.29 | 472.00 | -0.15% | |
| cadene-dpn92_1 | 1 | 246.73 | 247.15 | -0.17% | |
| cadene-resnext101_1 | 1 | 204.62 | 203.28 | 0.66% | |
| onnx-taau-downsample | 1 | 206.30 | 205.42 | 0.43% | |
| dlrm-criteoterabyte | 1 | 22.89 | 22.82 | 0.32% | |
| dlrm-criteoterabyte_fp16 | 1 | 43.82 | 43.76 | 0.15% | |
| agentmodel | 1 | 6,095.22 | 6,124.86 | -0.48% | |
| unet_fp16 | 2 | 34.29 | 34.22 | 0.21% | |
| resnet50v1_fp16 | 1 | 592.15 | 602.05 | -1.64% | |
| resnet50v1_int8 | 1 | 567.57 | 572.31 | -0.83% | |
| bert_base_cased_fp16 | 64 | 646.67 | 643.28 | 0.53% | |
| bert_large_uncased_fp16 | 32 | 198.91 | 197.79 | 0.56% | |
| bert_large_fp16 | 1 | 116.85 | 116.50 | 0.30% | |
| distilgpt2_fp16 | 16 | 1,212.19 | 1,203.59 | 0.71% | |
| yolov5s | 1 | 301.49 | 295.96 | 1.87% | |
| tinyllama | 1 | 23.31 | 23.22 | 0.40% | |
| vicuna-fastchat | 1 | 133.33 | 132.88 | 0.34% | |
| whisper-tiny-encoder | 1 | 244.23 | 243.45 | 0.32% | |
| whisper-tiny-decoder | 1 | 255.71 | 255.37 | 0.14% | |

This build is not recommended to merge 🔴

@migraphx-bot
Collaborator

migraphx-bot commented Apr 26, 2024


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

     ✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

     ✅ unet: PASSED: MIGraphX meets tolerance

     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

     ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

     🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ bert_large: PASSED: MIGraphX meets tolerance

     ✅ yolov5s: PASSED: MIGraphX meets tolerance

     ✅ tinyllama: PASSED: MIGraphX meets tolerance

     ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

     ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@@ -43,5 +46,31 @@ void sort_params(std::vector<instruction_ref>& params)
}));
}

std::vector<instruction_ref>
find_inputs(const std::unordered_map<instruction_ref, instruction_ref>& map_ins,
Contributor


This function needs a descriptive comment or no one else will ever be able to use it.

Contributor

@krzysz00 krzysz00 left a comment


Agreed with Brian about the need for comments and the need for filtering.

(I'm not going to block this because this isn't my team, but)

src/targets/gpu/fuse_mlir.cpp
@pfultz2 pfultz2 requested a review from a team as a code owner June 20, 2024 02:48
@@ -245,6 +245,21 @@ struct MIGRAPHX_EXPORT module
const std::vector<instruction_ref>& splits1,
const std::vector<instruction_ref>& splits2) const;

// Fuse the instruction into the module by inserting the instructions and
// parameters for any missing inputs.
std::vector<instruction_ref>
Member


Need some unit-tests

Comment on lines +74 to +77
std::transform(names.begin(), names.end(), std::back_inserter(result), [](const auto& p) {
return p.second;
});
assert(not sub or result.size() == sub->get_parameter_shapes().size());
Member


If sub == nullptr you can just do an early return

Collaborator Author


If sub == nullptr you can just do an early return

Early return where?

Member


Just at the start of the body of find_inputs().

Collaborator Author


Then that will skip getting the parameters. It's meant to be optional: if sub is null then it will assume all parameters come from the submodule.
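The pattern under discussion — collecting the mapped values from a name-sorted map with std::transform, with the submodule check applying only when a submodule is supplied — can be sketched in isolation. Types are simplified to standard containers and `collect_inputs` is a hypothetical name; the real code operates on instruction_ref and module pointers:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in: the actual code maps parameter names to
// instruction_ref and compares against module::get_parameter_shapes().
std::vector<int> collect_inputs(const std::map<std::string, int>& names,
                                const std::map<std::string, int>* sub)
{
    std::vector<int> result;
    // Collect the mapped values in name-sorted order.
    std::transform(names.begin(), names.end(), std::back_inserter(result), [](const auto& p) {
        return p.second;
    });
    // The size check only applies when a submodule is given, which is why a
    // plain early return at the top of the function would skip the
    // collection step rather than just the check.
    assert(sub == nullptr or result.size() == sub->size());
    return result;
}
```

This makes the trade-off concrete: the null case still needs the transform to run, so the guard belongs in the assertion rather than in an early return.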

Member

@umangyadav umangyadav left a comment


Looks good overall, but we need to add unit and verify tests. I am not sure how to do verify tests with the ENV flag, though.

@pfultz2
Collaborator Author

pfultz2 commented Jun 20, 2024

I am not sure how to do verify tests with ENV flag though.

We can add the ENV var to the MLIR Jenkins job. We already do this to enable MLIR for everything.

@umangyadav umangyadav merged commit ff81caa into develop Jul 16, 2024
44 of 46 checks passed
@umangyadav umangyadav deleted the mlir-fuse-inputs branch July 16, 2024 12:21
TedThemistokleous pushed a commit that referenced this pull request Aug 21, 2024