Fix mod op compile issue on GPU #3086

Merged: 3 commits merged into develop from mod_op_fix on May 27, 2024

Conversation

gyulaz-htec (Collaborator)

Lifting the failing int8, uint8 and int32 data types to float to work around the problem described in #2268 (comment).
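
As a rough sketch of what that lifting amounts to (a hypothetical helper with illustrative names, not MIGraphX's actual lowering code): promote the integer operands to float, compute a mod with divisor-sign semantics, and convert the result back.

```cpp
// Hypothetical sketch of the float-lifting workaround; names are illustrative only.
#include <cmath>
#include <cstdint>

int32_t mod_via_float(int32_t a, int32_t b)
{
    const float fa = static_cast<float>(a);
    const float fb = static_cast<float>(b);
    // ONNX Mod (fmod=0) semantics: the result takes the sign of the divisor.
    const float r = std::fmod(std::fmod(fa, fb) + fb, fb);
    return static_cast<int32_t>(r);
}
```

A float round trip is exact for int8/uint8 but can lose precision for int32 values beyond float's 24-bit significand, which aligns with the suggestion later in this thread to compute the mod natively on the GPU instead.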

@gyulaz-htec gyulaz-htec requested a review from causten as a code owner May 15, 2024 12:21
@gyulaz-htec gyulaz-htec changed the title from "Fix mod compile issue on GPU (#2268)" to "Fix mod op compile issue on GPU (#2268)" May 15, 2024
@gyulaz-htec gyulaz-htec changed the title from "Fix mod op compile issue on GPU (#2268)" to "Fix mod op compile issue on GPU" May 15, 2024
migraphx-bot (Collaborator) commented May 15, 2024

| Test | Batch | Rate new (a27482) | Rate old (c2167f) | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 1,714.12 | 1,714.50 | -0.02% | |
| torchvision-resnet50_fp16 | 64 | 3,810.36 | 3,812.07 | -0.04% | |
| torchvision-densenet121 | 32 | 1,456.56 | 1,453.98 | 0.18% | |
| torchvision-densenet121_fp16 | 32 | 2,430.13 | 2,436.13 | -0.25% | |
| torchvision-inceptionv3 | 32 | 883.76 | 883.63 | 0.02% | |
| torchvision-inceptionv3_fp16 | 32 | 1,382.20 | 1,417.34 | -2.48% | |
| cadene-inceptionv4 | 16 | 408.08 | 407.89 | 0.05% | |
| cadene-resnext64x4 | 16 | 413.72 | 414.01 | -0.07% | |
| slim-mobilenet | 64 | 3,822.90 | 3,825.99 | -0.08% | |
| slim-nasnetalarge | 64 | 97.05 | 97.09 | -0.03% | |
| slim-resnet50v2 | 64 | 1,651.52 | 1,651.41 | 0.01% | |
| bert-mrpc-onnx | 8 | 589.51 | 590.42 | -0.15% | |
| bert-mrpc-tf | 1 | 288.82 | 280.89 | 2.82% | |
| pytorch-examples-wlang-gru | 1 | 330.05 | 333.91 | -1.16% | |
| pytorch-examples-wlang-lstm | 1 | 264.58 | 305.41 | -13.37% | 🔴 |
| torchvision-resnet50_1 | 1 | 434.55 | 452.26 | -3.91% | 🔴 |
| cadene-dpn92_1 | 1 | 228.70 | 244.50 | -6.46% | 🔴 |
| cadene-resnext101_1 | 1 | 189.05 | 188.04 | 0.54% | |
| onnx-taau-downsample | 1 | 204.11 | 204.42 | -0.15% | |
| dlrm-criteoterabyte | 1 | 21.85 | 22.30 | -2.01% | |
| dlrm-criteoterabyte_fp16 | 1 | 40.67 | 41.62 | -2.29% | |
| agentmodel | 1 | 7,250.47 | 5,802.65 | 24.95% | 🔆 |
| unet_fp16 | 2 | 33.74 | 34.27 | -1.54% | |
| resnet50v1_fp16 | 1 | 569.50 | 553.42 | 2.91% | |
| resnet50v1_int8 | 1 | 464.01 | 463.00 | 0.22% | |
| bert_base_cased_fp16 | 64 | 620.80 | 623.70 | -0.46% | |
| bert_large_uncased_fp16 | 32 | 193.81 | 194.38 | -0.29% | |
| bert_large_fp16 | 1 | 103.96 | 104.06 | -0.09% | |
| distilgpt2_fp16 | 16 | 1,188.43 | 1,192.82 | -0.37% | |
| yolov5s | 1 | 297.35 | 298.59 | -0.41% | |
| tinyllama | 1 | 23.34 | 23.34 | -0.00% | |
| vicuna-fastchat | 1 | 132.18 | 133.58 | -1.06% | |
| whisper-tiny-encoder | 1 | 240.92 | 240.98 | -0.03% | |
| whisper-tiny-decoder | 1 | 245.83 | 245.92 | -0.04% | |

This build is not recommended to merge 🔴

migraphx-bot (Collaborator) commented May 15, 2024


- ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
- ✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
- ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
- ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
- ✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
- ✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
- ✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
- ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
- ✅ agentmodel: PASSED: MIGraphX meets tolerance
- ✅ unet: PASSED: MIGraphX meets tolerance
- ✅ resnet50v1: PASSED: MIGraphX meets tolerance
- ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
- 🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
- ✅ bert_large: PASSED: MIGraphX meets tolerance
- ✅ yolov5s: PASSED: MIGraphX meets tolerance
- ✅ tinyllama: PASSED: MIGraphX meets tolerance
- ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
- ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
- ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
- ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance


codecov bot commented May 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.82%. Comparing base (bceef13) to head (a274820).
Report is 158 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3086      +/-   ##
===========================================
- Coverage    91.82%   91.82%   -0.01%     
===========================================
  Files          486      486              
  Lines        18991    18990       -1     
===========================================
- Hits         17438    17437       -1     
  Misses        1553     1553              


pfultz2 (Collaborator) commented May 15, 2024

We should add a function to do the mod on the GPU correctly instead of falling back to float. Maybe something like:

template <class T, class U>
constexpr auto mod(T a, U b)
{
    // Integral operands can use the native % operator directly.
    if constexpr(is_integral<T>{} and is_integral<U>{})
        return a % b;
    else
        // Floating-point operands: adjust remainder() so the result takes the
        // sign of the divisor, matching ONNX Mod semantics.
        return vec_transform(a, b)([](auto x0, auto x1) {
            return fmod(remainder(x0, x1) + x1, x1);
        });
}
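
A minimal standalone check of that formula (plain <cmath> on the host, not MIGraphX device code) showing it reproduces Python/ONNX-style mod, where the result follows the sign of the divisor:

```cpp
#include <cassert>
#include <cmath>

// remainder() rounds the quotient to nearest, so it can return a negative
// value; adding the divisor and taking fmod folds the result back to the
// divisor's sign.
static double onnx_style_fmod(double x, double y)
{
    return std::fmod(std::remainder(x, y) + y, y);
}

int main()
{
    assert(onnx_style_fmod(-7.0, 3.0) == 2.0); // plain std::fmod would give -1
    assert(onnx_style_fmod(7.0, 3.0) == 1.0);
    assert(onnx_style_fmod(-7.5, 3.0) == 1.5); // matches Python: -7.5 % 3 == 1.5
    assert(onnx_style_fmod(6.0, 3.0) == 0.0);
    return 0;
}
```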

@causten causten requested review from CharlieL7 and pfultz2 and removed request for CharlieL7 May 17, 2024 13:19
gyulaz-htec (Collaborator, Author) commented May 21, 2024

@umangyadav @pfultz2 I've updated the code according to what @pfultz2 mentioned.
I had to go with ((a % b) + b) % b instead of a % b to match the result ONNX expects for Mod: C++'s integer % truncates toward zero, so its result takes the sign of the dividend, while ONNX Mod expects the result to take the sign of the divisor.
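
For concreteness, a small compile-time illustration of that sign difference (assumed values, not the PR's actual test cases):

```cpp
// C++'s % truncates toward zero, so its result follows the dividend's sign;
// ONNX Mod (fmod=0) expects the result to follow the divisor's sign.
constexpr int onnx_int_mod(int a, int b) { return ((a % b) + b) % b; }

static_assert((-7 % 3) == -1, "plain % follows the dividend");
static_assert(onnx_int_mod(-7, 3) == 2, "ONNX Mod expects 2 here");
static_assert(onnx_int_mod(7, -3) == -2, "result follows the divisor");
static_assert(onnx_int_mod(7, 3) == 1, "positive operands are unchanged");
```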

@gyulaz-htec gyulaz-htec requested a review from umangyadav May 21, 2024 14:54
@causten causten merged commit ca65129 into develop May 27, 2024
47 checks passed
@causten causten deleted the mod_op_fix branch May 27, 2024 19:01
lajagapp pushed a commit to lajagapp/AMDMIGraphX that referenced this pull request Jul 8, 2024