Fuse contiguous and layout with pointwise #2889

umangyadav · 2024-03-15T14:24:45Z

rocBLAS cannot handle batch dimensions that are not collapsible, so it requires a contiguous operation on such inputs. The eliminate_contiguous pass will not remove these contiguous operations.
If such an input comes from a pointwise op, the contiguous can instead be fused with the pointwise op.
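
To illustrate what "collapsible" means here, the following is a minimal sketch in Python, not MIGraphX code. It assumes the usual definition: batch dimensions can be merged into one strided dimension when each outer stride equals the next inner stride times the next inner length. The function name `batch_dims_collapsible` and the `matrix_dims` parameter are hypothetical, chosen only for this example.

```python
def batch_dims_collapsible(lens, strides, matrix_dims=2):
    """Return True if all batch dims can be merged into one strided dim.

    Hypothetical helper for illustration; not part of MIGraphX or rocBLAS.
    """
    # Drop size-1 batch dims; their strides never affect addressing.
    batch = [
        (length, stride)
        for length, stride in zip(lens[:-matrix_dims], strides[:-matrix_dims])
        if length != 1
    ]
    # An outer dim merges with the next inner dim when
    # outer_stride == inner_stride * inner_len.
    return all(
        outer_stride == inner_stride * inner_len
        for (_, outer_stride), (inner_len, inner_stride) in zip(batch, batch[1:])
    )


# Packed [2, 3, 4, 5] tensor: the two batch dims merge into one of size 6.
packed = batch_dims_collapsible([2, 3, 4, 5], [60, 20, 5, 1])
# Same lens, but the two batch dims were transposed: no single batch stride exists.
transposed = batch_dims_collapsible([2, 3, 4, 5], [20, 40, 5, 1])
print(packed, transposed)
```

In the transposed case a contiguous copy is what would make the batch dimensions collapsible again, which is the copy this PR aims to fold into the producing pointwise op.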

codecov · 2024-03-15T15:53:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.84%. Comparing base (21b71c6) to head (d2eaba6).

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #2889   +/-   ##
========================================
  Coverage    91.84%   91.84%           
========================================
  Files          478      478           
  Lines        18179    18179           
========================================
  Hits         16696    16696           
  Misses        1483     1483


migraphx-bot · 2024-03-15T18:43:37Z

Test	Batch	Rate new d2eaba	Rate old 5032ef	Diff	Compare
torchvision-resnet50	64	3,056.50	3,058.75	-0.07%	✅
torchvision-resnet50_fp16	64	7,111.00	7,127.23	-0.23%	✅
torchvision-densenet121	32	2,449.89	2,451.06	-0.05%	✅
torchvision-densenet121_fp16	32	4,112.70	4,109.97	0.07%	✅
torchvision-inceptionv3	32	1,657.14	1,659.74	-0.16%	✅
torchvision-inceptionv3_fp16	32	2,612.99	2,619.60	-0.25%	✅
cadene-inceptionv4	16	778.11	780.84	-0.35%	✅
cadene-resnext64x4	16	746.04	745.77	0.04%	✅
slim-mobilenet	64	6,718.81	6,714.58	0.06%	✅
slim-nasnetalarge	64	175.94	175.97	-0.02%	✅
slim-resnet50v2	64	2,979.29	2,981.85	-0.09%	✅
bert-mrpc-onnx	8	1,070.29	1,070.87	-0.05%	✅
bert-mrpc-tf	1	467.73	446.05	4.86%	🔆
pytorch-examples-wlang-gru	1	389.40	377.78	3.08%	🔆
pytorch-examples-wlang-lstm	1	414.51	354.57	16.90%	🔆
torchvision-resnet50_1	1	792.59	792.31	0.04%	✅
cadene-dpn92_1	1	428.20	428.08	0.03%	✅
cadene-resnext101_1	1	363.78	363.76	0.01%	✅
onnx-taau-downsample	1	348.50	349.31	-0.23%	✅
dlrm-criteoterabyte	1	34.29	34.73	-1.29%	✅
dlrm-criteoterabyte_fp16	1	58.01	57.72	0.50%	✅
agentmodel	1	6,591.44	6,816.03	-3.30%	🔴
unet_fp16	2	58.38	58.23	0.26%	✅
resnet50v1_fp16	1	957.26	985.92	-2.91%	✅
resnet50v1_int8	1	877.91	866.51	1.32%	✅
bert_base_cased_fp16	64	1,040.74	1,041.04	-0.03%	✅
bert_large_uncased_fp16	32	323.29	321.85	0.45%	✅
bert_large_fp16	1	nan	nan	nan%	❌
distilgpt2_fp16	16	2,026.24	2,026.48	-0.01%	✅
yolov5s	1	519.75	514.61	1.00%	✅
tinyllama	1	44.83	44.75	0.18%	✅
vicuna-fastchat	1	182.65	182.17	0.27%	✅
whisper-tiny-encoder	1	402.65	403.43	-0.19%	✅
whisper-tiny-decoder	1	431.58	420.63	2.60%	✅

This build is not recommended to merge 🔴
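
For readers checking the table by hand, the Diff column appears consistent with a relative change of the new rate against the old rate. The sketch below assumes that formula; the helper name `pct_diff` is hypothetical and this is not the bot's actual code.

```python
def pct_diff(rate_new, rate_old):
    """Relative change of rate_new versus rate_old, in percent (assumed formula)."""
    return (rate_new - rate_old) / rate_old * 100.0


# Rates copied from the bert-mrpc-tf and agentmodel rows above.
bert_mrpc_tf = pct_diff(467.73, 446.05)
agentmodel = pct_diff(6591.44, 6816.03)
print(f"{bert_mrpc_tf:.2f}% {agentmodel:.2f}%")
```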

migraphx-bot · 2024-03-15T18:43:38Z

❌bert-mrpc-onnx: ERROR - check error output

Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 205, in main
    model = migraphx.parse_onnx(model_name, default_dim_value=batch)
RuntimeError: /src/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: /new-saved-models/huggingface-transformers/bert_mrpc1.onnx

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

❌cadene-resnext101_1: ERROR - check error output

2024-03-18 15:31:05.762634355 [W:onnxruntime:, model.cc:183 Model] ONNX Runtime only guarantees support for models stamped with opset version 7 or above for opset domain 'ai.onnx'. Please upgrade your model to opset 7 or higher. For now, this opset 6 model may run depending upon legacy support of some older opset version operators.
2024-03-18 15:31:05.769372695 [W:onnxruntime:, transpose_optimizer.cc:28 ApplyImpl] Transpose optimizer failed: Unsupported ONNX opset: 6
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 267, in main
    sess = ort.InferenceSession(model_name,
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 463, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for BatchNormalization(6) node with name ''

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

❌unet: ERROR - check error output

Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 207, in main
    model = migraphx.parse_onnx(model_name,
RuntimeError: /src/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: /new-saved-models/unet/model.onnx

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

❌bert_large: ERROR - check error output

Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 205, in main
    model = migraphx.parse_onnx(model_name, default_dim_value=batch)
RuntimeError: /src/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: /new-saved-models/bert/model.onnx

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

umangyadav added 5 commits March 14, 2024 13:59

add find_layout_pointwise

e39fda0

Fix output shape

5e45c63

rename

7d97672

add test for contiguous_pointwise

93a8be5

add test for layout too

4b43c35

umangyadav requested a review from causten as a code owner March 15, 2024 14:24

umangyadav requested review from shivadbhavsar, pfultz2 and kahmed10 and removed request for causten and shivadbhavsar March 15, 2024 14:24

umangyadav self-assigned this Mar 15, 2024

umangyadav added the Perf Improve label Mar 15, 2024

fix merge conflicts

550bf0a

umangyadav force-pushed the fuse_contiguous_pointwise branch from a739662 to 550bf0a Compare March 15, 2024 15:39

umangyadav and others added 2 commits March 15, 2024 11:41

Merge branch 'develop' into fuse_contiguous_pointwise

70a108a

merge fix

99d7f57

umangyadav added 2 commits March 15, 2024 15:56

add verify test

7135e40

Formatting

6ebbf4d

causten added the high priority label Mar 15, 2024

umangyadav added 3 commits March 18, 2024 12:33

Merge remote-tracking branch 'origin/develop' into fuse_contiguous_po…

3a5943e

…intwise

fix layernorm pointwise output shape

439eeb3

Fix unit-test

d2eaba6

pfultz2 approved these changes Mar 18, 2024

causten approved these changes Mar 18, 2024

causten merged commit 17ad55e into develop Mar 18, 2024
48 checks passed

causten deleted the fuse_contiguous_pointwise branch March 18, 2024 21:30
