qlinearadd operator #2188
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #2188 +/- ##
===========================================
- Coverage 91.49% 91.45% -0.05%
===========================================
Files 430 433 +3
Lines 16129 16176 +47
===========================================
+ Hits 14758 14794 +36
- Misses 1371 1382 +11
Do we need a new operator for this? I thought this could be implemented as quantizelinear + add.
Are you suggesting that parse_quantizelinear.cpp should internally invoke quantizelinear + add instead of the new operator? That may be one approach, but I got the impression that a new operator was required. Thanks.
Yes, but for parse_qlinear_add.cpp (not parse_quantizelinear.cpp).
The goal is to support the ONNX operators, but if an operator can be added using existing operators without too much hassle then we definitely want to do that. See parse_celu.cpp, parse_selu.cpp, parse_softsign.cpp, parse_thresholdrelu.cpp, etc.
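The decomposition being discussed here — expressing QLinearAdd through existing dequantize/add/quantize steps rather than a new operator — can be sketched in plain scalar arithmetic. This is an illustrative sketch of the underlying math only, not the MIGraphX API; all function names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Dequantize a uint8 value back to float: (q - zero_point) * scale.
float dequantize(uint8_t q, float scale, uint8_t zero_point)
{
    return (static_cast<int>(q) - static_cast<int>(zero_point)) * scale;
}

// Quantize a float to uint8: round(x / scale) + zero_point, clamped to [0, 255].
uint8_t quantize(float x, float scale, uint8_t zero_point)
{
    int q = static_cast<int>(std::nearbyint(x / scale)) + zero_point;
    return static_cast<uint8_t>(std::clamp(q, 0, 255));
}

// QLinearAdd as dequantizelinear -> add -> quantizelinear, element-wise.
uint8_t qlinear_add(uint8_t a, float sa, uint8_t za,
                    uint8_t b, float sb, uint8_t zb,
                    float sc, uint8_t zc)
{
    return quantize(dequantize(a, sa, za) + dequantize(b, sb, zb), sc, zc);
}
```

In the parser, the same three steps are emitted as graph instructions instead of scalar calls, with the scale and zero-point tensors broadcast to the input shape first.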
This build is OK for merge ✅
🔴 bert_base_cased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
🔴 distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
Force-pushed from 46d2b4d to 4ee4d69
Force-pushed from 4ee4d69 to b795462
src/onnx/parse_qlinearadd.cpp
Outdated
ev_arg_fscale.visit([&](auto s) { sc_val.assign(s.begin(), s.end()); });
shape sh_scale = {shape::float_type, {sc_val.size()}};
instruction_ref bcast_scale;
if(sc_val.size() > 1)
This should do arg_fscale->get_shape().elements() > 1.
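The code under review handles a scale that arrives either as a scalar or as a 1-D per-axis tensor, choosing between a multibroadcast and an axis-0 broadcast. As a hedged illustration of the effect of the two paths (a hypothetical standalone helper, not MIGraphX code), expanding either form of scale to a dense rows × cols buffer looks like this:

```cpp
#include <cstddef>
#include <vector>

// Expand a scale to a dense rows x cols buffer.
// A 1-D scale (size > 1) repeats per row (broadcast on axis 0);
// a scalar scale (size == 1) fills every element (multibroadcast).
std::vector<float> broadcast_scale(const std::vector<float>& scale,
                                   std::size_t rows, std::size_t cols)
{
    std::vector<float> out(rows * cols);
    for(std::size_t r = 0; r < rows; ++r)
        for(std::size_t c = 0; c < cols; ++c)
            out[r * cols + c] = scale.size() > 1 ? scale[r] : scale[0];
    return out;
}
```

The review point stands either way: the branch condition only needs the element count, so querying the shape directly avoids materializing the literal's values first.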
src/onnx/parse_qlinearadd.cpp
Outdated
// prep 1: broadcast scale. it can come as a scalar or a 1-D tensor.
std::vector<float> sc_val;
auto ev_arg_fscale = arg_fscale->eval();
There is no reason to require arg_fscale to be a literal.
src/onnx/parse_qlinearadd.cpp
Outdated
if(sc_val.size() > 1)
    bcast_scale = info.add_instruction(
        migraphx::make_op("broadcast", {{"axis", 0}, {"out_lens", in_lens}}),
        info.add_literal(sh_scale, sc_val));
This should be arg_fscale instead of info.add_literal(...).
src/onnx/parse_qlinearadd.cpp
Outdated
instruction_ref bcast_qdq_instr(const std::string& op_name,
                                const instruction_ref& x_in,
                                const instruction_ref& arg_fscale,
                                const instruction_ref& arg_z_pt,
Pass instruction_ref by value.
We could pull the common functionality amongst the quant operators (e.g. QLinearConv, etc.), bcast_qdq_instr, into a separate file rather than have four copies of it.
Yes, that's a good idea. You can add it under include/migraphx/onnx/.
It also would be better to rename the function to insert_linear_instructions.
Another idea would be to use the same parser class for all these operators. The parser class can list multiple operators:
std::vector<op_desc> operators() const
{
    return {{"QLinearAdd", "add"},
            {"QLinearConv", "convolution"}};
}
The op_desc.onnx_name will be the first name (i.e. QLinearAdd) and op_desc.op_name will be the second (i.e. add). You can see parse_reduce_op.cpp as an example:
https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/src/onnx/parse_reduce_op.cpp#L93
I don't know if that will be easier, since it looks like this might share more than just one function.
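The multi-operator parser idea above can be sketched as a simple name table mapping each ONNX operator name to an internal operator name. This is a hypothetical standalone sketch; the real op_desc struct and parser registration live inside the MIGraphX ONNX parser.

```cpp
#include <map>
#include <string>

// Mirrors the suggested op_desc pairing: onnx_name -> op_name.
struct op_desc
{
    std::string onnx_name; // name as it appears in the ONNX model
    std::string op_name;   // internal operator the parser lowers to
};

// Build a lookup table from the operator list a shared parser would declare.
std::map<std::string, std::string> make_qlinear_table()
{
    std::map<std::string, std::string> table;
    for(const op_desc& od : {op_desc{"QLinearAdd", "add"},
                             op_desc{"QLinearConv", "convolution"}})
        table[od.onnx_name] = od.op_name;
    return table;
}
```

A single parse routine can then do the shared dequantize/compute/quantize wrapping and only look up which inner operator to emit.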
I have looked at parse_pooling.cpp as an example. There is not too much commonality amongst these quant operators except the need to quantize/dequantize. They are all of different classes, with qdq as the commonality.
It also would be better to rename the function to insert_linear_instructions.
Documentation-wise, I want to give a clear idea of this specific usage. It should have broadcast in its name, and it is specifically for quantize and dequantize. It isn't necessarily general enough beyond that; hence the name bcast_qdq_instr().
Moved to another file, in the subdir you suggested. Thanks.
src/onnx/parse_qlinearadd.cpp
Outdated
// prep 2: broadcast zero point. it can come as a scalar or a 1-D tensor.
std::vector<int> z_pt_val;
auto ev_arg_z_pt = arg_z_pt->eval();
Same here. This is only used to check the size, so it would be better to use elements() instead.
Already fixed.
There should also be a test added to
Yes, the parser tests will be added later; I don't want to chase manual graph creation (to verify parsing) until parsing is all nailed down. Thanks. (There is already an ONNX verification test in place.)
Done: qlinearadd_test
const auto& in_a = args[0];
const auto& in_scale_a = args[1];
const auto& in_zero_pt_a = args[2];
auto dquant_a = bcast_qdq_instr("dequantizelinear", in_a, in_scale_a, in_zero_pt_a, info);
Suggested change:
- auto dquant_a = bcast_qdq_instr("dequantizelinear", in_a, in_scale_a, in_zero_pt_a, info);
+ auto dquant_a = info.add_common_op("dequantizelinear", {in_a, in_scale_a, in_zero_pt_a}, false);
This requires some modifications of add_common_op under onnx_parser.cpp/hpp and common.cpp/hpp, but it would encourage reuse of existing functions that aim to do the same thing. I've tested by making the convert optional in add_common_op, adding a flag at the end of the function (on by default to keep other function calls intact). It seems to pass both tests when trying this approach.
@pfultz2 @umangyadav feel free to chime in on whether this approach is cleaner or not.
Thanks. Please note, this new API uses Broadcast and Multibroadcast. The other one doesn't use Broadcast. I am not sure if we want to make it a kitchen sink there; there are other quant operators that I am adding which will use Broadcast, and so that part isn't tested by your new patch.
I don't think add_common_op will work because we compare from the right. In this case we would need to compare from the left, and that only works because it will broadcast on axis 0. For axis 1 this wouldn't work. We could extend add_common_op so an optional axis could be passed to handle 1-D tensors.
Either way, we may want to address this in another PR.
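The compare-from-the-right point can be made concrete. Under right-aligned (NumPy/ONNX-style) multidirectional broadcasting, a 1-D tensor is matched against the trailing dimension of the other shape, which is why a per-channel tensor meant for axis 0 (or axis 1) of an N-D input needs an explicit broadcast with an axis instead. A hedged sketch of the right-aligned compatibility rule (hypothetical helper, not the add_common_op implementation):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Right-aligned broadcast compatibility: walk both shapes from the last
// dimension; each pair must be equal or contain a 1.
bool right_aligned_compatible(const std::vector<std::size_t>& a,
                              const std::vector<std::size_t>& b)
{
    std::size_t n = std::min(a.size(), b.size());
    for(std::size_t i = 0; i < n; ++i)
    {
        std::size_t da = a[a.size() - 1 - i];
        std::size_t db = b[b.size() - 1 - i];
        if(da != db && da != 1 && db != 1)
            return false;
    }
    return true;
}
```

A length-4 tensor lines up with the trailing dimension of {2, 3, 4}, but a length-3 tensor aimed at the middle axis does not, matching the observation that only an axis-aware broadcast covers that case.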
I think we should address this in another PR. Effectively, we want to create reusable functions that will automatically handle broadcasting shapes to be compatible for any operation.
I agree that the common_op updates should be in another PR. We need to discuss how we want to design such an API for these cases.
LGTM, but I would like @umangyadav to finish his review as well.
pp["B"] = migraphx::argument(b, data_b.data());
auto result = p.eval(pp).back();
std::vector<unsigned char> result_vector;
Suggested change:
- std::vector<unsigned char> result_vector;
+ std::vector<uint8_t> result_vector;
LGTM just merge this suggestion.
No description provided.