Optimal memory access for GEMM #918
Unanswered
jungpark-mlir
asked this question in
Ideas
Replies: 1 comment
BTW, I'm not suggesting this as a 6.0 item.
0 replies
From my understanding, we currently load both matA and matB into LDS and then transfer them into VGPRs.
This was designed for IGEMM and works fine there, but it's questionable whether it's still the optimal solution for pure GEMM.
Staging through LDS may give the loads a chance to be pipelined, but it might be unnecessary.
It might be worth trying to load only one of matA or matB into LDS,
or to skip LDS entirely and share data between lanes using swizzle or DPP.
This might be long-term homework.
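
To make the "no LDS" option a bit more concrete, here is a toy Python model of a single wavefront computing one tile of C = A x B. It is only a sketch of the data movement, not real kernel code: `WAVE`, `broadcast_from_lane` and `gemm_tile_cross_lane` are hypothetical names, the wave size is shrunk to 4 lanes, and the broadcast helper merely stands in for whatever cross-lane primitive (swizzle, DPP, or a read-lane style operation) would be used on hardware. Each lane keeps one element of A and one element of B in its own "register" per k step, and the A element is shared with the other lanes by a broadcast instead of a round trip through LDS.

```python
# Toy model: one wavefront computes a WAVE x WAVE tile of C = A @ B
# using only per-lane registers and cross-lane broadcasts (no LDS staging).

WAVE = 4  # lanes per wavefront in this toy model (assumption; real waves are wider)


def broadcast_from_lane(regs, src_lane):
    """Model a cross-lane broadcast: every lane receives src_lane's register value."""
    return [regs[src_lane]] * len(regs)


def gemm_tile_cross_lane(A, B):
    """A is WAVE x K, B is K x WAVE. Lane j owns column j of the C tile."""
    K = len(B)
    # acc[lane][i] accumulates C[i][lane]; these model per-lane accumulator registers.
    acc = [[0] * WAVE for _ in range(WAVE)]
    for k in range(K):
        # Each lane loads exactly one element of A and one of B for this k step.
        a_reg = [A[lane][k] for lane in range(WAVE)]
        b_reg = [B[k][lane] for lane in range(WAVE)]
        for i in range(WAVE):
            # Lane i's A element is shared with all lanes via a broadcast.
            a_bcast = broadcast_from_lane(a_reg, i)
            for lane in range(WAVE):
                acc[lane][i] += a_bcast[lane] * b_reg[lane]
    # Reassemble the tile in row-major order: C[i][j] lives in acc[j][i].
    return [[acc[j][i] for j in range(WAVE)] for i in range(WAVE)]


def gemm_reference(A, B):
    """Plain triple-loop GEMM used to check the lane model."""
    rows, K, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(K)) for j in range(cols)]
            for i in range(rows)]
```

In this model each k step costs `WAVE` broadcasts per wave. Whether that beats LDS staging on real hardware depends on tile shape, occupancy and how well the loads can still be pipelined, which is exactly the open question above.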