FP8 QuantDot operation #2506
test/verify/main.cpp
Outdated
"quant_dot_3args_5<migraphx::fp8::fp8e4m3fnuz, float>"});
rv.disable_test_for("gpu",
                    {"test_conv_bn_add",
                     // These passes on MI300 but fails on others, same issue as CPU.
We will have to raise the tolerance for FP8. I can't see any other way around it.
4da1510 to d4a6dbd
src/targets/ref/lowering.cpp
Outdated
return static_cast<int32_t>(x);
});
});
});
We shouldn't be converting the input here. `gemm` is already using a double accumulator. We could also convert the `amat` and `bmat` to double to avoid loss with the multiply as well, but this can be done inline in the function.
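To illustrate the suggestion above, here is a minimal sketch of a reference GEMM that accumulates in double and widens both operands inline before the multiply. It is not MIGraphX's `gemm`: the name `ref_gemm_sketch`, the flat row-major `std::vector` layout, and the `alpha`/`beta` parameters are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not MIGraphX's gemm): C = alpha * A * B + beta * C
// for row-major A (m x k), B (k x n) and C (m x n).
template <class In, class Out>
void ref_gemm_sketch(std::vector<Out>& c,
                     const std::vector<In>& a,
                     const std::vector<In>& b,
                     std::size_t m,
                     std::size_t n,
                     std::size_t k,
                     double alpha = 1.0,
                     double beta  = 0.0)
{
    for(std::size_t i = 0; i < m; i++)
    {
        for(std::size_t j = 0; j < n; j++)
        {
            // Accumulate in double regardless of the input type.
            double acc = 0.0;
            for(std::size_t p = 0; p < k; p++)
            {
                // Widen each operand inline so the multiply itself is done in double.
                acc += static_cast<double>(a[i * k + p]) * static_cast<double>(b[p * n + j]);
            }
            // Narrow to the output type only once, at the very end.
            c[i * n + j] =
                static_cast<Out>(alpha * acc + beta * static_cast<double>(c[i * n + j]));
        }
    }
}
```

With this shape the caller does not need to convert the inputs up front; the only narrowing conversion is the final store into the output type.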
Done.
src/targets/ref/lowering.cpp
Outdated
});
});
});
migemm(result, arg_0, arg_1, int32_t{1}, int32_t{0});
This should just call our `gemm` function instead of `migemm`.
Done.
Looks good besides Paul's comments.
This version of Dot/GEMM differs in that it takes FP8 input args but computes an fp32 output.
Two of the verify tests need to be disabled for the CPU backend because they are failing. The failure comes from the lossy cast (float -> fp8 -> float) inside the "ref" implementation, whereas the CPU backend optimizes the `float -> fp8 -> float` converts out to a no-op. The results therefore do not match between "Ref" and "CPU".
Depends on #2473
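The mismatch described above can be sketched with a simplified stand-in for the fp8 round trip. The function below is hypothetical and is not the `migraphx::fp8` conversion: it only keeps 4 significant bits (1 implicit plus 3 explicit mantissa bits, as in an e4m3 layout) and ignores the exponent range, saturation and NaN handling.

```cpp
#include <cmath>

// Hypothetical stand-in for a float -> fp8 -> float round trip.
// Assumption: only mantissa precision is modeled (4 significant bits);
// exponent range, saturation and NaN encoding are ignored.
float fp8_like_round_trip(float x)
{
    if(x == 0.0f || !std::isfinite(x))
        return x;
    int exp    = 0;
    float frac = std::frexp(x, &exp); // x = frac * 2^exp, with |frac| in [0.5, 1)
    // Scaling by 16 keeps 4 significant bits after rounding to an integer.
    float rounded = std::nearbyint(frac * 16.0f) / 16.0f;
    return std::ldexp(rounded, exp);
}

// A backend that performs the round trip sees the rounded operand...
float with_round_trip(float x, float y) { return fp8_like_round_trip(x) * y; }

// ...while a backend that removes the convert pair as a no-op sees the original.
float with_no_op(float x, float y) { return x * y; }
```

In this sketch `fp8_like_round_trip(1.1f)` comes back as `1.125f`, so `with_round_trip` and `with_no_op` disagree for any operand that is not already representable with 4 significant bits, which is the kind of difference that shows up between the two backends.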