
Enable simplify qdq to work with FP8 types and fix bug in pass #2528

Merged — 1 commit merged into develop on Dec 7, 2023

Conversation

@umangyadav (Member) commented on Dec 7, 2023

match_find_quantizable_ops assumed that dequantizelinear only ever has one use, so the pass fails when its output has multiple uses.

This PR fixes that issue and adds the FP8 dtypes as allowed quantized types in the simplify_qdq pass.

Together with the changes in #2506, this set of changes helps run FP8 quantized models.
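For context, here is a minimal, self-contained C++ sketch of the failure mode and one possible way to restore the single-use assumption. This is not the MIGraphX matcher code; the node struct and the split_multi_use_dq helper are hypothetical illustrations only, and the actual fix in the pass may work differently.

```cpp
// Illustrative sketch only: models a dequantizelinear node whose output feeds
// two consumers, and duplicates it so each consumer sees a single-use copy.
// The node struct and split_multi_use_dq helper are hypothetical; they are not
// part of MIGraphX.
#include <cstddef>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct node
{
    std::string op;              // operator name, e.g. "dequantizelinear"
    std::vector<node*> inputs;   // producers feeding this node
    std::vector<node*> outputs;  // consumers reading this node
};

// Give every extra consumer its own copy of a multi-use dequantizelinear so a
// later pattern match that assumes "exactly one use" stays valid.
// (For brevity, producer output lists are not updated for the copies.)
void split_multi_use_dq(std::vector<std::unique_ptr<node>>& graph, node* dq)
{
    while(dq->outputs.size() > 1)
    {
        node* user = dq->outputs.back();
        dq->outputs.pop_back();

        auto copy     = std::make_unique<node>(*dq); // same op and inputs
        copy->outputs = {user};
        for(node*& in : user->inputs)
            if(in == dq)
                in = copy.get();
        graph.push_back(std::move(copy));
    }
}

int main()
{
    std::vector<std::unique_ptr<node>> graph;
    graph.push_back(std::make_unique<node>(node{"dequantizelinear", {}, {}}));
    graph.push_back(std::make_unique<node>(node{"dot", {}, {}}));
    graph.push_back(std::make_unique<node>(node{"convolution", {}, {}}));

    // Wire both the dot and the convolution to read the same dequantizelinear,
    // i.e. the multi-use case that broke the single-use assumption.
    node* dq = graph[0].get();
    for(std::size_t i = 1; i < graph.size(); ++i)
    {
        graph[i]->inputs.push_back(dq);
        dq->outputs.push_back(graph[i].get());
    }

    split_multi_use_dq(graph, dq); // dq now has exactly one consumer
    std::cout << "nodes after split: " << graph.size() << "\n"; // prints 4
}
```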

@umangyadav self-assigned this on Dec 7, 2023
@umangyadav added the label "FP8: issues related to FP8 implementation" on Dec 7, 2023
@migraphx-bot (Collaborator) commented:
| Test | Batch | Rate new (95616a) | Rate old (a09dc5) | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 2,834.92 | 2,833.31 | 0.06% | |
| torchvision-resnet50_fp16 | 64 | 6,502.95 | 6,501.92 | 0.02% | |
| torchvision-densenet121 | 32 | 2,072.36 | 2,094.20 | -1.04% | |
| torchvision-densenet121_fp16 | 32 | 3,667.31 | 3,665.08 | 0.06% | |
| torchvision-inceptionv3 | 32 | 1,598.30 | 1,594.40 | 0.24% | |
| torchvision-inceptionv3_fp16 | 32 | 2,559.88 | 2,557.61 | 0.09% | |
| cadene-inceptionv4 | 16 | 722.37 | 721.79 | 0.08% | |
| cadene-resnext64x4 | 16 | 692.35 | 692.75 | -0.06% | |
| slim-mobilenet | 64 | 8,322.56 | 8,332.19 | -0.12% | |
| slim-nasnetalarge | 64 | 230.55 | 230.63 | -0.03% | |
| slim-resnet50v2 | 64 | 2,665.09 | 2,664.91 | 0.01% | |
| bert-mrpc-onnx | 8 | 823.76 | 823.07 | 0.08% | |
| bert-mrpc-tf | 1 | 388.65 | 389.27 | -0.16% | |
| pytorch-examples-wlang-gru | 1 | 297.92 | 299.82 | -0.63% | |
| pytorch-examples-wlang-lstm | 1 | 313.30 | 308.52 | 1.55% | |
| torchvision-resnet50_1 | 1 | 608.00 | 602.66 | 0.89% | |
| torchvision-inceptionv3_1 | 1 | 342.26 | 343.14 | -0.26% | |
| cadene-dpn92_1 | 1 | 402.28 | 404.59 | -0.57% | |
| cadene-resnext101_1 | 1 | 328.33 | 328.32 | 0.00% | |
| slim-vgg16_1 | 1 | 459.12 | 460.67 | -0.34% | |
| slim-mobilenet_1 | 1 | 2,112.58 | 2,099.38 | 0.63% | |
| slim-inceptionv4_1 | 1 | 214.86 | 213.62 | 0.58% | |
| onnx-taau-downsample | 1 | 305.53 | 304.86 | 0.22% | |
| dlrm-criteoterabyte | 1 | 21.59 | 21.62 | -0.14% | |
| dlrm-criteoterabyte_fp16 | 1 | 40.63 | 40.65 | -0.06% | |
| agentmodel | 1 | 5,652.14 | 5,935.95 | -4.78% | 🔴 |
| unet_fp16 | 2 | 54.79 | 54.68 | 0.20% | |
| resnet50v1_fp16 | 1 | 943.90 | 932.98 | 1.17% | |
| bert_base_cased_fp16 | 64 | 903.60 | 903.11 | 0.06% | |
| bert_large_uncased_fp16 | 32 | 285.61 | 285.63 | -0.00% | |
| bert_large_fp16 | 1 | 166.63 | 166.70 | -0.04% | |
| distilgpt2_fp16 | 16 | 1,281.23 | 1,281.25 | -0.00% | |

This build is not recommended to merge 🔴

@migraphx-bot (Collaborator) commented:


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ torchvision-inceptionv3_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance
✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance
✅ slim-inceptionv4_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
🔴 bert_base_cased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large_uncased_fp16: PASSED: MIGraphX meets tolerance
✅ bert_large: PASSED: MIGraphX meets tolerance
🔴 distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

@causten merged commit 22f60c2 into develop on Dec 7, 2023
34 of 41 checks passed
@causten deleted the qdq_fp8 branch on December 8, 2023 at 17:27
Labels: FP8 (issues related to FP8 implementation)
5 participants