Add Epilogue Pipeline for PVC using EVT #80
LGTM
23ddf3c to 3755c7c
* Next need to integrate EVT
* Alpha scaling working, need to fix copy atom for C
* Need to remove register spill
* Need to reduce error margin
3755c7c to 8d1f87b
LGTM
class CopyOpR2S_
>
class CollectiveEpilogue<
IntelPVCEpilogue,
Is anything PVC specific? IntelXeEpilogue might be better.
Actually, we should check that. We are using PVC in some places and Xe in others, but I'm not entirely sure we are naming things correctly. I'll address this in a separate PR.
This PR introduces the Epilogue implementation for PVC using the Epilogue Visitor Tree (EVT), which is available for NVIDIA SM90 and later GPUs. It supports only the fusion::LinearCombination operation for PVC, i.e. D = alpha * A * B + beta * C, but it can be extended to other fusion operations by partially specializing the FusionCallbacks struct in the include/cutlass/epilogue/fusion/intel_pvc_callbacks.hpp file.
---------
Co-authored-by: Alejandro Acosta <[email protected]>
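The extension mechanism described above (one partial specialization per dispatch policy and fusion operation) can be illustrated with a small standalone sketch. This is not the code from this PR or from CUTLASS: the `sketch` namespace, the simplified two-parameter `FusionCallbacks` template, the `LinearCombination` tag with its `alpha`/`beta` members, and the `apply_epilogue` helper are all hypothetical stand-ins, chosen only to show how D = alpha * A * B + beta * C could be attached to a policy tag through specialization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical tag standing in for the PVC epilogue dispatch policy.
struct IntelPVCEpilogue {};

// Hypothetical fusion-operation descriptor carrying the two scalars.
template <class Element>
struct LinearCombination {
  Element alpha;
  Element beta;
};

// Primary template is declared but not defined, so an unsupported
// (policy, operation) pair fails at compile time.
template <class DispatchPolicy, class Operation>
struct FusionCallbacks;

// Partial specialization: linear combination for the PVC policy tag.
// Supporting another fusion operation would mean adding another
// specialization next to this one.
template <class Element>
struct FusionCallbacks<IntelPVCEpilogue, LinearCombination<Element>> {
  LinearCombination<Element> op;

  // acc is one element of the accumulator A * B, c the matching element of C.
  Element operator()(Element acc, Element c) const {
    return op.alpha * acc + op.beta * c;
  }
};

// Applies the callbacks element-wise: D[i] = callbacks(acc[i], C[i]).
// Assumes acc and c have the same length.
template <class Callbacks, class Element>
std::vector<Element> apply_epilogue(const Callbacks& callbacks,
                                    const std::vector<Element>& acc,
                                    const std::vector<Element>& c) {
  std::vector<Element> d(acc.size());
  for (std::size_t i = 0; i < acc.size(); ++i) {
    d[i] = callbacks(acc[i], c[i]);
  }
  return d;
}

}  // namespace sketch
```

In this sketch, instantiating `sketch::FusionCallbacks<sketch::IntelPVCEpilogue, sketch::LinearCombination<float>>` with `alpha = 2` and `beta = 0.5` and passing it to `apply_epilogue` scales each accumulator element by 2 and adds half of the matching C element. The real struct takes more template parameters (tile shapes and so on), which are omitted here for brevity.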