
Add concat fusions #2460

Merged: 49 commits merged into develop from concat2 on Jan 4, 2024

Conversation

pfultz2 (Collaborator) commented on Nov 20, 2023

This will fuse the inputs to concat if they are pointwise operators.
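To make the idea concrete, here is a minimal, self-contained C++ sketch of what the fusion buys. This is not MIGraphX code: the relu and sigmoid pointwise operators and the flat float buffers are arbitrary illustrative assumptions. Without the fusion, each pointwise input is materialized in its own buffer and then copied into the concat output; with it, each pointwise computation writes directly into its slice of the output.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// Unfused: each pointwise op materializes its own buffer, then concat copies.
std::vector<float> unfused(const std::vector<float>& a, const std::vector<float>& b)
{
    std::vector<float> ra(a.size()), rb(b.size());
    for(std::size_t i = 0; i < a.size(); ++i)
        ra[i] = std::max(a[i], 0.0f);                        // relu
    for(std::size_t i = 0; i < b.size(); ++i)
        rb[i] = 1.0f / (1.0f + std::exp(-b[i]));             // sigmoid
    std::vector<float> out;                                  // concat = two extra copies
    out.insert(out.end(), ra.begin(), ra.end());
    out.insert(out.end(), rb.begin(), rb.end());
    return out;
}

// Fused: each pointwise op writes straight into its slice of the concat
// output, so the intermediate buffers and the copies disappear.
std::vector<float> fused(const std::vector<float>& a, const std::vector<float>& b)
{
    std::vector<float> out(a.size() + b.size());
    for(std::size_t i = 0; i < a.size(); ++i)
        out[i] = std::max(a[i], 0.0f);                       // relu -> first slice
    for(std::size_t i = 0; i < b.size(); ++i)
        out[a.size() + i] = 1.0f / (1.0f + std::exp(-b[i])); // sigmoid -> second slice
    return out;
}

int main()
{
    std::vector<float> a{-1.0f, 2.0f};
    std::vector<float> b{0.0f, 3.0f};
    for(float v : fused(a, b))
        std::cout << v << ' ';
    std::cout << '\n';
    return 0;
}
```

On a GPU the same restructuring saves the intermediate buffers and the extra kernel launches and copies for the concat inputs, which is where a fusion like this would be expected to pay off.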

migraphx-bot (Collaborator) commented on Nov 21, 2023

| Test | Batch | Rate new (525826) | Rate old (c3049e) | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 2,835.40 | nan | nan% | |
| torchvision-resnet50_fp16 | 64 | 6,502.28 | 6,508.68 | -0.10% | |
| torchvision-densenet121 | 32 | 2,083.41 | 2,096.89 | -0.64% | |
| torchvision-densenet121_fp16 | 32 | 3,665.74 | 3,663.69 | 0.06% | |
| torchvision-inceptionv3 | 32 | 1,598.15 | 1,597.69 | 0.03% | |
| torchvision-inceptionv3_fp16 | 32 | 2,565.21 | 2,570.26 | -0.20% | |
| cadene-inceptionv4 | 16 | 723.06 | 722.53 | 0.07% | |
| cadene-resnext64x4 | 16 | 690.05 | 692.47 | -0.35% | |
| slim-mobilenet | 64 | 8,326.62 | 8,340.32 | -0.16% | |
| slim-nasnetalarge | 64 | 231.93 | 230.68 | 0.54% | |
| slim-resnet50v2 | 64 | 2,665.83 | 2,666.08 | -0.01% | |
| bert-mrpc-onnx | 8 | 813.52 | 812.68 | 0.10% | |
| bert-mrpc-tf | 1 | 387.87 | 388.68 | -0.21% | |
| pytorch-examples-wlang-gru | 1 | 302.51 | 307.81 | -1.72% | |
| pytorch-examples-wlang-lstm | 1 | 318.59 | 321.36 | -0.86% | |
| torchvision-resnet50_1 | 1 | 608.24 | 600.35 | 1.31% | |
| torchvision-inceptionv3_1 | 1 | 343.47 | 344.92 | -0.42% | |
| cadene-dpn92_1 | 1 | 402.30 | 402.49 | -0.05% | |
| cadene-resnext101_1 | 1 | 328.54 | 327.84 | 0.21% | |
| slim-vgg16_1 | 1 | 459.46 | 459.46 | 0.00% | |
| slim-mobilenet_1 | 1 | 2,178.00 | 2,148.27 | 1.38% | |
| slim-inceptionv4_1 | 1 | 213.93 | 214.50 | -0.27% | |
| onnx-taau-downsample | 1 | 305.39 | 305.58 | -0.06% | |
| dlrm-criteoterabyte | 1 | 21.62 | 21.58 | 0.15% | |
| dlrm-criteoterabyte_fp16 | 1 | 40.66 | 40.69 | -0.06% | |
| agentmodel | 1 | 6,037.26 | 6,010.48 | 0.45% | |
| unet_fp16 | 2 | 54.79 | 54.81 | -0.04% | |
| resnet50v1_fp16 | 1 | 931.58 | 932.09 | -0.05% | |
| bert_base_cased_fp16 | 64 | 924.67 | 924.64 | 0.00% | |
| bert_large_uncased_fp16 | 32 | 290.67 | 290.67 | 0.00% | |
| bert_large_fp16 | 1 | 171.81 | 171.98 | -0.10% | |
| distilgpt2_fp16 | 16 | 1,280.85 | 1,281.64 | -0.06% | |

This build is not recommended to merge 🔴

migraphx-bot (Collaborator) commented on Nov 21, 2023


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

     ✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

     ✅ torchvision-inceptionv3_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

     ✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance

     ✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance

     ✅ slim-inceptionv4_1: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

     ✅ unet: PASSED: MIGraphX meets tolerance

     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

     ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

     ✅ bert_large_uncased_fp16: PASSED: MIGraphX meets tolerance

     ✅ bert_large: PASSED: MIGraphX meets tolerance

     🔴 distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2022 Advanced Micro Devices, Inc. All rights reserved.

Is the old license stamper script being run? The date should be 2023

/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2022 Advanced Micro Devices, Inc. All rights reserved.

same as the other file

/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2022 Advanced Micro Devices, Inc. All rights reserved.

same as the other file

/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2022 Advanced Micro Devices, Inc. All rights reserved.

same as the other file

@@ -167,6 +168,8 @@ std::vector<pass> target::get_passes(migraphx::context& gctx, const compile_opti
dead_code_elimination{},
enable_pass(not enabled(MIGRAPHX_DISABLE_REDUCE_FUSION{}), fuse_reduce{}),
dead_code_elimination{},
fuse_concat{},

Asking the question here more out of curiosity. I assume order doesn't matter here for fuse_pointwise vs fuse_concat? Is there a benefit to swapping order ever around the fuse_reduce?

pfultz2 (Collaborator, Author) replied:

It needs to run after fuse_pointwise, otherwise there will be no pointwise modules to fuse with.

> Is there a benefit to swapping order ever around the fuse_reduce?

I don't think so. In either case we will have two kernels that need to be run if these two passes overlap.
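To make the ordering concrete, here is a toy, self-contained sketch. The `pass` struct is a stand-in, not the real MIGraphX pass type; only the relative order of fuse_pointwise, fuse_reduce, and fuse_concat is taken from the diff and the discussion above, and the surrounding passes are assumptions.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Toy stand-in for a compiler pass; the real passes are MIGraphX classes.
struct pass { std::string name; };

int main()
{
    // fuse_pointwise runs first so that pointwise modules exist for
    // fuse_concat (added by this PR) to absorb; fuse_concat is placed
    // after fuse_reduce, as in the diff above.
    std::vector<pass> pipeline = {
        {"fuse_pointwise"},
        {"dead_code_elimination"},
        {"fuse_reduce"},
        {"dead_code_elimination"},
        {"fuse_concat"},
        {"dead_code_elimination"},
    };
    for(const auto& p : pipeline)
        std::cout << p.name << '\n';
    return 0;
}
```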

TedThemistokleous (Collaborator) commented:

Fix CI, but otherwise I get the idea.

causten merged commit 7532007 into develop on Jan 4, 2024
14 of 15 checks passed
causten deleted the concat2 branch on January 4, 2024 at 16:23
Labels: enhancement (New feature or request)
6 participants