Flash attention support. #20152
base: master
Conversation
Codecov Report
Attention: Patch coverage is 8.00% (2 of the 25 new lines are hit; 23 are missing coverage, per the diff below).
Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master   #20152      +/-   ##
==========================================
- Coverage   79.35%   79.31%   -0.04%
==========================================
  Files         501      501
  Lines       47311    47336     +25
  Branches     8689     8695      +6
==========================================
+ Hits        37542    37544      +2
- Misses       8014     8037     +23
  Partials     1755     1755
```

Flags with carried forward coverage won't be shown.
☔ View full report in Codecov by Sentry.
Thanks for the PR -- the code looks good! Please add a unit test.
For the JAX version, I think we'd want to rely on a Pallas kernel. We can get help from the JAX team.
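As a rough sketch of that direction (not part of this PR's diff): recent JAX releases expose `jax.nn.dot_product_attention`, whose `implementation` argument can select a fused flash-attention kernel, and Pallas-backed attention kernels live under `jax.experimental.pallas.ops`, though their exact module paths have shifted between releases. The `fused_attention` wrapper below is a hypothetical name, not anything from the PR:

```python
# Sketch only: one possible shape for the JAX backend. The wrapper name
# `fused_attention` is hypothetical; the dispatch point in Keras may differ.
import jax
import jax.numpy as jnp

def fused_attention(query, key, value, is_causal=False, implementation="xla"):
    # jax.nn.dot_product_attention expects (batch, seq_len, num_heads, head_dim).
    # implementation="cudnn" selects the fused flash-attention kernel on
    # supported GPUs (it errors rather than falling back elsewhere);
    # "xla" is the portable reference path. A Pallas kernel, as suggested
    # above, could be dispatched to from this same wrapper.
    return jax.nn.dot_product_attention(
        query, key, value, is_causal=is_causal, implementation=implementation
    )

# Smoke test on the portable path.
q = jnp.ones((2, 16, 4, 8))  # (batch, seq_len, heads, head_dim)
print(fused_attention(q, q, q).shape)  # (2, 16, 4, 8)
```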
This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
I added flash attention support for the PyTorch backend.
Let me know what you think about the current implementation, so I can add support for JAX and maybe try TF as well.
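For reference, a minimal sketch of the PyTorch path and the kind of parity check the requested unit test could use, assuming the PR routes through `torch.nn.functional.scaled_dot_product_attention` (the actual diff may wire things differently):

```python
# Sketch only: PyTorch's fused SDPA entry point, which dispatches to the
# FlashAttention kernel when the inputs qualify (CUDA, fp16/bf16, etc.).
import torch
import torch.nn.functional as F

def flash_attention(query, key, value, is_causal=False):
    # In recent PyTorch, the flash backend can also be forced explicitly:
    #   with torch.nn.attention.sdpa_kernel([torch.nn.attention.SDPBackend.FLASH_ATTENTION]):
    #       ...
    return F.scaled_dot_product_attention(query, key, value, is_causal=is_causal)

# Parity check against a naive reference, along the lines of the unit test
# requested above; runs on CPU via the math backend.
q, k, v = (torch.randn(2, 4, 16, 8) for _ in range(3))  # (batch, heads, seq, head_dim)
scale = q.shape[-1] ** -0.5
ref = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1) @ v
assert torch.allclose(flash_attention(q, k, v), ref, atol=1e-5)
```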