Fix joint_matrix implementation to match latest api #491

@muhammad-tanvir-1211 (Collaborator) commented Jan 5, 2024

This PR updates the joint_matrix-based GEMM implementation to match the API from intel/llvm (commit 2a828f49283145433dc9bbbff74cefcb2d2b10dc).

  • Removed the deprecated get_wi_data() calls and replaced them with joint_matrix_apply() and joint_matrix_copy() calls.
  • Changed the leading dimensions for shared memory accesses to avoid excessive bank conflicts.
  • Refactored the code to write output data to global memory in a coalesced way (this was not possible earlier because joint_matrix_store() was performed directly on the global pointer C).
  • Fixed a race condition for batch sizes greater than 1.

@OuadiElfarouki (Collaborator) left a comment

LGTM!

@muhammad-tanvir-1211 force-pushed the joint_matrix_fix branch 3 times, most recently from ba3ca46 to 7985cbd, on February 1, 2024 14:22
test/blas_test.hpp (two review threads, outdated and resolved)
* Fix half tests by changing the initialization values
* Reduce the number of tests executed
* Increase error margin
@muhammad-tanvir-1211 merged commit 861b310 into codeplaysoftware:master on Feb 28, 2024
3 checks passed
@muhammad-tanvir-1211 deleted the joint_matrix_fix branch on April 11, 2024 17:11
4 participants