
qlinearadd operator #2188

Merged: 7 commits into develop from lw/q_linear_add, Oct 6, 2023

Conversation

lakhinderwalia (Contributor)

No description provided.

@lakhinderwalia self-assigned this Sep 14, 2023
@lakhinderwalia marked this pull request as draft September 14, 2023 17:36
@lakhinderwalia linked an issue Sep 14, 2023 that may be closed by this pull request
codecov bot commented Sep 14, 2023

Codecov Report

Merging #2188 (b50db04) into develop (dcc7b0a) will decrease coverage by 0.05%.
Report is 15 commits behind head on develop.
The diff coverage is 81.15%.

❗ Current head b50db04 differs from the pull request's most recent head 35214db. Consider uploading reports for the commit 35214db to get more accurate results.

```
@@             Coverage Diff             @@
##           develop    #2188      +/-   ##
===========================================
- Coverage    91.49%   91.45%   -0.05%
===========================================
  Files          430      433       +3
  Lines        16129    16176      +47
===========================================
+ Hits         14758    14794      +36
- Misses        1371     1382      +11
```
| Files | Coverage Δ |
|---|---|
| src/compile_src.cpp | 85.00% <100.00%> (ø) |
| src/cpp_generator.cpp | 70.83% <100.00%> (ø) |
| src/include/migraphx/compile_src.hpp | 100.00% <100.00%> (ø) |
| src/include/migraphx/op/convert.hpp | 100.00% <100.00%> (ø) |
| src/include/migraphx/op/isnan.hpp | 100.00% <100.00%> (ø) |
| src/onnx/parse_castlike.cpp | 100.00% <100.00%> (ø) |
| src/onnx/parse_constant_of_shape.cpp | 92.85% <100.00%> (+1.19%) ⬆️ |
| src/onnx/parse_resize.cpp | 95.33% <100.00%> (+0.03%) ⬆️ |
| src/include/migraphx/verify.hpp | 53.57% <50.00%> (-0.82%) ⬇️ |
| src/program.cpp | 69.51% <0.00%> (ø) |

... and 2 more

pfultz2 (Collaborator) commented Sep 14, 2023

Do we need a new operator for this? I thought this could be implemented as quantizelinear+add.

lakhinderwalia (Contributor, author)

> Do we need a new operator for this? I thought this could be implemented as quantizelinear+add.

Are you suggesting that parse_quantizelinear.cpp should internally invoke quantizelinear + add instead of the new operator? That may be one approach, but I got the impression that a new operator was required. Thanks.

pfultz2 (Collaborator) commented Sep 14, 2023

> Are you suggesting that parse_quantizelinear.cpp should internally invoke quantizelinear + add instead of the new operator?

Yes, but for parse_qlinear_add.cpp (not parse_quantizelinear.cpp).

> but I got the impression that a new operator was required...

The goal is to support the ONNX operators, but if an operator can be implemented with existing operators without too much hassle then we definitely want to do that. See parse_celu.cpp, parse_selu.cpp, parse_softsign.cpp, parse_thresholdrelu.cpp, etc.
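For context, the decomposition being suggested would look roughly like the sketch below. This is a simplified outline following MIGraphX's op_parser pattern, not the code from this PR; it glosses over the scale/zero-point broadcasting that a real parser must insert before the (de)quantize steps, and the argument order follows the ONNX QLinearAdd input list.

```cpp
// Sketch: lower ONNX QLinearAdd onto existing MIGraphX operators.
// QLinearAdd inputs: A, A_scale, A_zero_point, B, B_scale, B_zero_point,
// C_scale, C_zero_point. Scale/zero-point broadcasting is omitted here.
struct parse_qlinearadd : op_parser<parse_qlinearadd>
{
    std::vector<op_desc> operators() const { return {{"QLinearAdd"}}; }

    instruction_ref parse(const op_desc&,
                          const onnx_parser&,
                          const onnx_parser::node_info& info,
                          std::vector<instruction_ref> args) const
    {
        // Dequantize both operands into the float domain.
        auto dq_a = info.add_instruction(
            migraphx::make_op("dequantizelinear"), args[0], args[1], args[2]);
        auto dq_b = info.add_instruction(
            migraphx::make_op("dequantizelinear"), args[3], args[4], args[5]);
        // Do the add on the dequantized values.
        auto sum = info.add_instruction(migraphx::make_op("add"), dq_a, dq_b);
        // Requantize the result with the output scale/zero-point.
        return info.add_instruction(
            migraphx::make_op("quantizelinear"), sum, args[6], args[7]);
    }
};
```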

migraphx-bot (Collaborator) commented Sep 15, 2023

| Test | Batch | Rate new (d2f567) | Rate old (65c37c) | Diff |
|---|---|---|---|---|
| torchvision-resnet50 | 64 | 2,292.08 | 2,324.63 | -1.40% |
| torchvision-resnet50_fp16 | 64 | 5,267.58 | 5,357.98 | -1.69% |
| torchvision-densenet121 | 32 | 1,838.90 | 1,847.88 | -0.49% |
| torchvision-densenet121_fp16 | 32 | 3,404.08 | 3,409.96 | -0.17% |
| torchvision-inceptionv3 | 32 | 1,293.68 | 1,296.14 | -0.19% |
| torchvision-inceptionv3_fp16 | 32 | 2,471.69 | 2,534.23 | -2.47% |
| cadene-inceptionv4 | 16 | 619.87 | 620.23 | -0.06% |
| cadene-resnext64x4 | 16 | 588.63 | 588.45 | 0.03% |
| slim-mobilenet | 64 | 7,207.78 | 7,215.95 | -0.11% |
| slim-nasnetalarge | 64 | 236.16 | 236.51 | -0.15% |
| slim-resnet50v2 | 64 | 2,554.51 | 2,555.59 | -0.04% |
| bert-mrpc-onnx | 8 | 824.34 | 825.02 | -0.08% |
| bert-mrpc-tf | 1 | 388.35 | 389.38 | -0.26% |
| pytorch-examples-wlang-gru | 1 | 296.24 | 294.47 | 0.60% |
| pytorch-examples-wlang-lstm | 1 | 311.62 | 307.96 | 1.19% |
| torchvision-resnet50_1 | 1 | 550.36 | 544.78 | 1.03% |
| torchvision-inceptionv3_1 | 1 | 307.83 | 305.19 | 0.86% |
| cadene-dpn92_1 | 1 | 355.92 | 352.21 | 1.05% |
| cadene-resnext101_1 | 1 | 219.68 | 220.12 | -0.20% |
| slim-vgg16_1 | 1 | 224.38 | 224.29 | 0.04% |
| slim-mobilenet_1 | 1 | 1,523.76 | 1,497.40 | 1.76% |
| slim-inceptionv4_1 | 1 | 216.97 | 214.96 | 0.94% |
| onnx-taau-downsample | 1 | 307.09 | 306.79 | 0.10% |
| dlrm-criteoterabyte | 1 | 21.70 | 21.67 | 0.18% |
| dlrm-criteoterabyte_fp16 | 1 | 40.75 | 40.71 | 0.09% |
| agentmodel | 1 | 5,841.77 | 5,776.01 | 1.14% |
| unet_fp16 | 2 | 55.13 | 55.16 | -0.06% |
| resnet50v1_fp16 | 1 | 769.24 | 766.51 | 0.36% |
| bert_base_cased_fp16 | 64 | 970.47 | 971.03 | -0.06% |
| bert_large_uncased_fp16 | 32 | 304.90 | 305.05 | -0.05% |
| bert_large_fp16 | 1 | 166.72 | 166.83 | -0.07% |
| distilgpt2_fp16 | 16 | 1,352.11 | 1,350.80 | 0.10% |

This build is OK for merge ✅

migraphx-bot (Collaborator) commented Sep 15, 2023


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ torchvision-inceptionv3_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance
✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance
✅ slim-inceptionv4_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
🔴 bert_base_cased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large: PASSED: MIGraphX meets tolerance
🔴 distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

@lakhinderwalia force-pushed the lw/q_linear_add branch 3 times, most recently from 46d2b4d to 4ee4d69, October 2, 2023 23:48
@lakhinderwalia marked this pull request as ready for review October 2, 2023 23:54
```cpp
ev_arg_fscale.visit([&](auto s) { sc_val.assign(s.begin(), s.end()); });
shape sh_scale = {shape::float_type, {sc_val.size()}};
instruction_ref bcast_scale;
if(sc_val.size() > 1)
```
Collaborator: This should do `arg_fscale->get_shape().elements() > 1`.


```cpp
// prep 1: broadcast scale. it can come as a scalar or a 1-D tensor.
std::vector<float> sc_val;
auto ev_arg_fscale = arg_fscale->eval();
```
Collaborator: There is no reason to require arg_fscale to be a literal.

```cpp
if(sc_val.size() > 1)
    bcast_scale = info.add_instruction(
        migraphx::make_op("broadcast", {{"axis", 0}, {"out_lens", in_lens}}),
        info.add_literal(sh_scale, sc_val));
```
Collaborator: This should be arg_fscale instead of info.add_literal(...).
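Applying that suggestion, the call would become something like this (a sketch; it feeds the original argument straight into the broadcast rather than re-creating it as a literal):

```cpp
// Broadcast the scale argument itself; no eval()/add_literal round trip.
bcast_scale = info.add_instruction(
    migraphx::make_op("broadcast", {{"axis", 0}, {"out_lens", in_lens}}),
    arg_fscale);
```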

```cpp
instruction_ref bcast_qdq_instr(const std::string& op_name,
                                const instruction_ref& x_in,
                                const instruction_ref& arg_fscale,
                                const instruction_ref& arg_z_pt,
```
Collaborator: Pass instruction_ref by value.
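The signature would then read roughly as follows (a sketch; the trailing info parameter is inferred from the call sites later in this thread, and its exact type is an assumption):

```cpp
// instruction_ref is a small, iterator-like handle, so copying it is cheap;
// passing by value also drops a needless level of indirection.
instruction_ref bcast_qdq_instr(const std::string& op_name,
                                instruction_ref x_in,
                                instruction_ref arg_fscale,
                                instruction_ref arg_z_pt,
                                const onnx_parser::node_info& info);
```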

lakhinderwalia (author): We could pull the common functionality bcast_qdq_instr (shared amongst the quant operators, e.g. QLinearConv etc.) into a separate file, rather than have four copies of it.

Collaborator: Yes, that's a good idea. You can add it under include/migraphx/onnx/.

Collaborator: It also would be better to rename the function to insert_linear_instructions.

Collaborator: Another idea would be to use the same parser class for all these operators. The parser class can list multiple operators:
```cpp
std::vector<op_desc> operators() const
{
    return {{"QLinearAdd", "add"},
            {"QLinearConv", "convolution"}};
}
```

The op_desc.onnx_name will be the first name (i.e. QLinearAdd) and op_desc.op_name will be the second (i.e. add). You can see parse_reduce_op.cpp as an example:

https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/src/onnx/parse_reduce_op.cpp#L93

I don't know if that will be easier, since it looks like this might share more than just one function.

lakhinderwalia (author): I have looked at parse_pooling.cpp as an example. There is not much commonality amongst these quant operators except the need to quantize/dequantize; they are all different classes, with Q/DQ as the common piece.

lakhinderwalia (author) commented Oct 3, 2023:

> It also would be better to rename the function to insert_linear_instructions.

Documentation-wise, I want to give a clear idea of this specific usage: the name should contain broadcast, and refer specifically to quantize and dequantize. It isn't a general-purpose helper beyond that; hence the name bcast_qdq_instr().

lakhinderwalia (author): Moved to another file, in the subdir you suggested. Thanks.


```cpp
// prep 2: broadcast zero point. it can come as a scalar or a 1-D tensor.
std::vector<int> z_pt_val;
auto ev_arg_z_pt = arg_z_pt->eval();
```
Collaborator: Same here. This is only used to check the size, so it would be better to use elements() instead.

lakhinderwalia (author): Already fixed.

pfultz2 (Collaborator) commented Oct 3, 2023

There should also be a test added to onnx_test.cpp.

lakhinderwalia (Contributor, author)

> There should also be a test added to onnx_test.cpp.

Yes, the parser tests will be added later; I don't want to chase manual graph creation (to verify parsing) until the parsing is all nailed down. Thanks.

(There is already an ONNX verification test in place.)

Resolved review threads: src/onnx/parse_qlinearadd.cpp (4), test/onnx/verify_onnx.cpp (1).
lakhinderwalia (Contributor, author)

> There should also be a test added to onnx_test.cpp.

Done: qlinearadd_test

```cpp
const auto& in_a = args[0];
const auto& in_scale_a = args[1];
const auto& in_zero_pt_a = args[2];
auto dquant_a = bcast_qdq_instr("dequantizelinear", in_a, in_scale_a, in_zero_pt_a, info);
```
Collaborator:

Suggested change:

```diff
-auto dquant_a = bcast_qdq_instr("dequantizelinear", in_a, in_scale_a, in_zero_pt_a, info);
+auto dquant_a = info.add_common_op("dequantizelinear", {in_a, in_scale_a, in_zero_pt_a}, false);
```

This requires some modifications of add_common_op under onnx_parser.cpp/hpp and common.cpp/hpp, but it would encourage reuse of existing functions that aim to do the same thing. I've tested this by making the convert optional in add_common_op, adding a flag at the end of the function (on by default to keep other call sites intact). It passes both tests with this approach.

@pfultz2 @umangyadav feel free to chime in if this approach is cleaner or not.
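A rough shape of the modification being described (hypothetical; the actual add_common_op declaration lives in the parser/common headers and may differ):

```cpp
// Hypothetical extension of add_common_op: a trailing flag (defaulted to
// true so existing call sites are untouched) controls whether mismatched
// input types get an implicit convert inserted.
instruction_ref add_common_op(const std::string& op_name,
                              std::vector<instruction_ref> inputs,
                              bool insert_convert = true);
```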

lakhinderwalia (author): Thanks. Please note: this new API uses broadcast and multibroadcast; the other one doesn't use broadcast. I am not sure we want to make it a kitchen sink there.

There are other quant operators I am adding which will use broadcast, so that part isn't exercised by your new patch.

pfultz2 (Collaborator) commented Oct 5, 2023:

I don't think add_common_op will work, because we compare from the right. In this case we would need to compare from the left, and that only works here because it broadcasts on axis 0; for axis 1 this wouldn't work. We could extend add_common_op so an optional axis could be passed to handle 1-D tensors.

Either way we may want to address this in another PR.
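To illustrate the alignment issue (an example, not code from the PR):

```cpp
// Right-aligned (numpy-style) broadcasting, which add_common_op relies on:
//   data {3, 4} vs. scale {3} -> {3} lines up with the trailing axis
//   (size 4) and fails; a scale of {4} would match, but on the wrong axis.
// Explicit-axis broadcasting, as this parser does:
//   make_op("broadcast", {{"axis", 0}, {"out_lens", in_lens}}) pins the
//   1-D scale {3} onto axis 0, which is why the current code works for
//   axis 0 but would need an optional axis parameter for axis 1.
```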

Collaborator: I think we should address this in another PR. Effectively we want to create reusable functions that will automatically handle broadcasting shapes to be compatible for any operation.

Collaborator: I agree that the common_op updates should be in another PR. We need to discuss how we want to design such an API for these cases.

pfultz2 (Collaborator) commented Oct 6, 2023

LGTM, but I would like @umangyadav to finish his review as well.

pp["B"] = migraphx::argument(b, data_b.data());
auto result = p.eval(pp).back();

std::vector<unsigned char> result_vector;
Member:

Suggested change:

```diff
-std::vector<unsigned char> result_vector;
+std::vector<uint8_t> result_vector;
```

Member: LGTM, just merge this suggestion.

@causten merged commit 19c8744 into develop Oct 6, 2023
14 of 15 checks passed
@causten deleted the lw/q_linear_add branch October 6, 2023 22:32
Successfully merging this pull request may close this issue: Support the ORT QLinearAdd and QLinearConv operators.