
Add Qlinearconcat op #2476

Merged (4 commits into develop, Dec 12, 2023)
Conversation

@gyulaz-htec (Collaborator) commented Nov 28, 2023

The QLinearConcat operator is required by the following int8 ONNX Model Zoo models:

  • YOLOv3
  • Mask-RCNN
  • Inception v1
  • DenseNet-121
  • Ultra-lightweight face detection model

Fixes: migraphx-benchmark#158

@gyulaz-htec gyulaz-htec requested review from pfultz2, CharlieL7 and lakhinderwalia and removed request for pfultz2 November 28, 2023 14:50
@migraphx-bot (Collaborator) commented Nov 28, 2023

| Test | Batch | Rate new (55db65) | Rate old (7e5359) | Diff Compare |
|---|---|---|---|---|
| torchvision-resnet50 | 64 | 2,834.04 | 2,835.01 | -0.03% |
| torchvision-resnet50_fp16 | 64 | 6,500.15 | 6,502.27 | -0.03% |
| torchvision-densenet121 | 32 | 2,094.95 | 2,095.15 | -0.01% |
| torchvision-densenet121_fp16 | 32 | 3,661.24 | 3,666.68 | -0.15% |
| torchvision-inceptionv3 | 32 | 1,598.51 | 1,597.66 | 0.05% |
| torchvision-inceptionv3_fp16 | 32 | 2,556.83 | 2,557.91 | -0.04% |
| cadene-inceptionv4 | 16 | 722.29 | 722.14 | 0.02% |
| cadene-resnext64x4 | 16 | 692.58 | 692.42 | 0.02% |
| slim-mobilenet | 64 | 8,332.17 | 8,340.53 | -0.10% |
| slim-nasnetalarge | 64 | 230.55 | nan | nan% |
| slim-resnet50v2 | 64 | 2,664.42 | 2,665.84 | -0.05% |
| bert-mrpc-onnx | 8 | 824.41 | 823.89 | 0.06% |
| bert-mrpc-tf | 1 | 386.35 | 389.19 | -0.73% |
| pytorch-examples-wlang-gru | 1 | 300.58 | 300.04 | 0.18% |
| pytorch-examples-wlang-lstm | 1 | 312.34 | 310.27 | 0.66% |
| torchvision-resnet50_1 | 1 | 603.19 | 605.92 | -0.45% |
| torchvision-inceptionv3_1 | 1 | 342.02 | 343.11 | -0.32% |
| cadene-dpn92_1 | 1 | 401.00 | 401.62 | -0.15% |
| cadene-resnext101_1 | 1 | 327.79 | 327.69 | 0.03% |
| slim-vgg16_1 | 1 | 458.13 | 460.40 | -0.49% |
| slim-mobilenet_1 | 1 | 2,109.22 | 2,070.91 | 1.85% |
| slim-inceptionv4_1 | 1 | 213.91 | 213.25 | 0.31% |
| onnx-taau-downsample | 1 | 304.95 | 305.23 | -0.09% |
| dlrm-criteoterabyte | 1 | 21.60 | 21.60 | 0.03% |
| dlrm-criteoterabyte_fp16 | 1 | 40.63 | 40.64 | -0.03% |
| agentmodel | 1 | 5,972.00 | 5,973.93 | -0.03% |
| unet_fp16 | 2 | 54.79 | 54.69 | 0.18% |
| resnet50v1_fp16 | 1 | 917.95 | 939.01 | -2.24% |
| bert_base_cased_fp16 | 64 | 903.28 | 903.48 | -0.02% |
| bert_large_uncased_fp16 | 32 | 285.78 | 285.65 | 0.05% |
| bert_large_fp16 | 1 | 166.94 | 166.71 | 0.14% |
| distilgpt2_fp16 | 16 | 1,282.20 | 1,281.56 | 0.05% |

This build is not recommended to merge 🔴

@migraphx-bot (Collaborator) commented Nov 28, 2023


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ torchvision-inceptionv3_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance
✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance
✅ slim-inceptionv4_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
✅ bert_large_uncased_fp16: PASSED: MIGraphX meets tolerance
✅ bert_large: PASSED: MIGraphX meets tolerance
🔴 distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

@gyulaz-htec force-pushed the qlinearconcat branch 2 times, most recently from e562b9a to 484aa06 on November 29, 2023 09:50
@lakhinderwalia (Contributor) left a comment

Thank you.

Review thread on the input-count check:
if((args_size < 5) or ((args_size - 2) % 3 != 0))
MIGRAPHX_THROW("QLINEARCONCAT: missing inputs");
Collaborator:
Aren't the inputs supposed to be tuples? As in each input with type TV is a tuple of (Tensor, Scale, ZeroPoint). The spec says (3 - inf) inputs, so that's what I would expect.

@gyulaz-htec (Collaborator, Author):

I've checked the models in the description, and their inputs are parsed as just a simple sequence of tensors; I can't see any tuple type among the shapes there.
For MaskRCNN-int8 I can see the following inputs for QLinearConcat operator:

shape: float_type, {1}, {0}
shape: uint8_type, {1}, {0}
shape: uint8_type, {1000, 4}, {4, 1}
shape: float_type, {1}, {0}
shape: uint8_type, {1}, {0}
shape: uint8_type, {1000, 4}, {4, 1}
shape: float_type, {1}, {0}
shape: uint8_type, {1}, {0}
shape: uint8_type, {1000, 4}, {4, 1}
shape: float_type, {1}, {0}
shape: uint8_type, {1}, {0}
shape: uint8_type, {1000, 4}, {4, 1}
shape: float_type, {1}, {0}
shape: uint8_type, {1}, {0}
shape: uint8_type, {507, 4}, {4, 1}
shape: float_type, {1}, {0}
shape: uint8_type, {1}, {0}

Is there any specific parsing option for tuple_type?
ONNX Runtime also parses these inputs as a sequence of tensors:
https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cpu/quantization/qlinear_concat.cc#L18-L19

@CharlieL7 (Collaborator) commented Dec 7, 2023
Given that the models use the operator that way, the code as currently written makes sense to me. I also don't think there is a tuple type in ONNX. The spec is problematic, however (Microsoft probably made a mistake). I would like the docstring with the spec on this operator deleted and replaced with something that reflects the code.

@gyulaz-htec (Collaborator, Author):

Removed the docstring and added comments about the actual input tensor layout to the input-checking code.

@causten causten merged commit 7e03e05 into develop Dec 12, 2023
8 of 9 checks passed
@causten causten deleted the qlinearconcat branch December 12, 2023 21:11
Successfully merging this pull request may close these issues.

QLinearConcat is unsupported