
FP8 QuantDot operation #2506

Merged
merged 121 commits into develop on Dec 12, 2023
Conversation

umangyadav
Member

@umangyadav umangyadav commented Dec 2, 2023

This version of Dot/GEMM differs in that it takes FP8 inputs but computes an FP32 output.

Two of the verify tests need to be disabled for the CPU backend because they fail. The failure comes from the lossy cast (float -> fp8 -> float) inside the "ref" implementation, while the CPU backend optimizes the float -> fp8 -> float round-trip away as a no-op. As a result, the "ref" and "CPU" results do not match.

Depends on #2473

Base automatically changed from rocblas_fp8 to develop December 6, 2023 01:20
@umangyadav umangyadav requested a review from pfultz2 December 6, 2023 01:32
"quant_dot_3args_5<migraphx::fp8::fp8e4m3fnuz, float>"});
rv.disable_test_for("gpu",
{"test_conv_bn_add",
// These pass on MI300 but fail on others; same issue as the CPU.
Member Author
Will have to raise the tolerance for FP8; I can't see another way around it.

return static_cast<int32_t>(x);
});
});
});
@pfultz2 pfultz2 (Collaborator) Dec 8, 2023

We shouldn't be converting the input here. gemm is already using a double accumulator. We could also convert the amat and bmat to double to avoid loss with the multiply as well, but this can be done inline in the function.
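A minimal sketch of the suggestion above, under illustrative names (`gemm_ref` is not the actual MIGraphX signature): leave the inputs in their own type, widen each product to double, and accumulate in double before the final cast back to float:

```cpp
#include <cstddef>
#include <vector>

// Hedged sketch of a reference GEMM with a double accumulator. The inputs
// stay in their (possibly low-precision) type T; each a*b product is
// computed in double so the multiply itself does not lose precision, and
// the sum is accumulated in double before being narrowed once at the end.
template <class T>
void gemm_ref(const std::vector<T>& a,
              const std::vector<T>& b,
              std::vector<float>& c,
              std::size_t m,
              std::size_t n,
              std::size_t k)
{
    for(std::size_t i = 0; i < m; ++i)
        for(std::size_t j = 0; j < n; ++j)
        {
            double acc = 0.0; // double accumulator avoids fp32 rounding drift
            for(std::size_t p = 0; p < k; ++p)
                acc += static_cast<double>(a[i * k + p]) *
                       static_cast<double>(b[p * n + j]);
            c[i * n + j] = static_cast<float>(acc);
        }
}
```

The key design point is that no up-front conversion of the input buffers is needed: the widening happens inline, per element, inside the inner loop.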

Member Author
Done.

});
});
});
migemm(result, arg_0, arg_1, int32_t{1}, int32_t{0});
Collaborator
This should just call our gemm function instead of migemm.

Member Author
Done.

@ahsan-ca ahsan-ca (Contributor) left a comment
Looks good besides Paul's comments.

@causten causten merged commit aac4e95 into develop Dec 12, 2023
38 of 40 checks passed
@umangyadav umangyadav deleted the quant_gemm_fp8 branch December 12, 2023 14:34
Labels
FP8: issues related to the FP8 implementation
6 participants