[WS] add pingpong pass #24

Open
wants to merge 1 commit into
base: ws

Conversation

manman-ren (Contributor) commented Jan 16, 2025

Initial implementation of PingPong across the two consumer warp groups. It assumes that both consumer warp groups execute the same code. The pass tries to identify the CUDA core regions and the tensor core regions, then inserts named barriers to synchronize the two consumer warp groups, so that when one consumer warp group is in a CUDA core region, the other is in a tensor core region.
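To illustrate the intended schedule, here is a minimal host-side sketch in Python. It is not the pass itself: two threads stand in for the two consumer warp groups, and a pair of semaphores stands in for the named barriers. The function name `simulate_pingpong` and all variable names are hypothetical, chosen for illustration only.

```python
import threading


def simulate_pingpong(num_iters=4):
    """Simulate two consumer warp groups alternating on the tensor core.

    Returns the order of tensor-core entries as (warp_group, iteration)
    pairs, plus the peak number of groups inside the tensor-core region.
    """
    # turn[i] is released when warp group i may enter its tensor-core region.
    # Warp group 0 goes first, so its semaphore starts at 1.
    turn = [threading.Semaphore(1), threading.Semaphore(0)]
    log_lock = threading.Lock()
    order = []
    state = {"in_tensor": 0, "peak": 0}

    def consumer(wg):
        other = 1 - wg
        for k in range(num_iters):
            turn[wg].acquire()  # wait for our turn on the tensor core
            with log_lock:
                state["in_tensor"] += 1
                state["peak"] = max(state["peak"], state["in_tensor"])
                order.append((wg, k))
            # Tensor-core region for iteration k would run here.
            with log_lock:
                state["in_tensor"] -= 1
            turn[other].release()  # hand the tensor core to the other group
            # CUDA-core region for iteration k would run here, overlapping
            # with the other group's tensor-core region.

    threads = [threading.Thread(target=consumer, args=(wg,)) for wg in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order, state["peak"]


if __name__ == "__main__":
    order, peak = simulate_pingpong(3)
    print(order, peak)
```

In this sketch the tensor-core entries strictly alternate between the two groups and at most one group is in the tensor-core region at a time. On the GPU the hand-off would be done with named barriers rather than semaphores.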

The pass is guarded by the environment variable ENABLE_PINGPONG.
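For A/B comparisons, one way to toggle the variable from a Python test script is sketched below. The helper `pingpong_env` is hypothetical, and the value `"1"` is an assumption: the description does not say which values the pass accepts or at what point the variable is read, so it should be set before any kernel is compiled.

```python
import os
from contextlib import contextmanager


@contextmanager
def pingpong_env(enabled=True):
    """Hypothetical helper: temporarily set or clear ENABLE_PINGPONG."""
    previous = os.environ.get("ENABLE_PINGPONG")
    if enabled:
        os.environ["ENABLE_PINGPONG"] = "1"  # "1" is an assumed value
    else:
        os.environ.pop("ENABLE_PINGPONG", None)
    try:
        yield
    finally:
        # Restore whatever was set before entering the block.
        if previous is None:
            os.environ.pop("ENABLE_PINGPONG", None)
        else:
            os.environ["ENABLE_PINGPONG"] = previous
```

Kernels compiled inside `with pingpong_env():` would then see the variable, and those compiled outside would not.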

facebook-github-bot added the CLA Signed label Jan 16, 2025
manman-ren requested a review from htyu January 16, 2025 18:20

manman-ren (Contributor, Author) commented

@bertmaher This is the initial implementation. The logic is similar to AMD pingpong, but AMD pingpong does slicing, clustering, and conditional barriers across two waves (the two waves execute the same code). It would be great if you could try it out on small-K GEMMs.
