Optimal memory access for GEMM #918

jungpark-mlir · 2022-12-16T15:33:44Z

jungpark-mlir
Dec 16, 2022
Collaborator

From my understanding, we currently load both matA and matB into LDS and then transfer them into VGPRs.
This was designed for IGEMM and works fine there, but it's questionable whether it's still the optimal solution for pure GEMM.
Staging through LDS may give the loads a chance to be pipelined, but that may be unnecessary.

It might be worth trying to load only one of matA or matB into LDS,
or to skip LDS entirely and share data between lanes using swizzle or DPP.
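
To make the lane-sharing idea concrete, here is a rough CPU-side model in Python. It is not kernel code and does not reflect how our generated kernels are structured. The names `lane_broadcast`, `gemm_row_via_broadcast`, and `gemm` are made up for illustration, and `lane_broadcast` only stands in for whatever cross-lane primitive (swizzle, DPP, or similar) would actually be used; it does not model the restrictions of any specific instruction. The sketch assumes a tile where the K dimension equals the number of lanes: each lane keeps one element of a row of matA in a register, owns one column of matB, and gets the other lanes' matA elements by broadcast instead of through a shared buffer.

```python
def lane_broadcast(regs, src_lane):
    """Model a cross-lane read: every lane receives the value held by src_lane.

    `regs` is a list with one register value per lane.
    """
    return [regs[src_lane]] * len(regs)


def gemm_row_via_broadcast(a_row, b):
    """Compute one row of C = A @ B with no shared staging buffer.

    Assumption for this sketch: len(a_row) == number of lanes == columns of b.
    Lane l holds a_row[l] in a register and accumulates C[i][l].
    """
    lanes = len(a_row)
    a_reg = list(a_row)      # one element of the matA row per lane
    acc = [0] * lanes        # one accumulator per lane
    for k in range(lanes):
        # Every lane obtains a_row[k] from lane k (the cross-lane step).
        a_k = lane_broadcast(a_reg, k)
        # On hardware the lanes execute this in lockstep; here we loop.
        for lane in range(lanes):
            acc[lane] += a_k[lane] * b[k][lane]
    return acc


def gemm(a, b):
    """Apply the per-row model to every row of a."""
    return [gemm_row_via_broadcast(row, b) for row in a]
```

In this model matB is still read directly by each lane, which roughly corresponds to the "only one matrix goes through a sharing mechanism" variant. Whether the cross-lane traffic is actually cheaper than an LDS round trip would have to be measured on real hardware.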

This might be a long-term homework item.

jungpark-mlir · 2022-12-16T15:34:53Z

jungpark-mlir
Dec 16, 2022
Collaborator Author

BTW, I'm not suggesting this as a 6.0 item.
