CUDA role in Memory Efficient Attention #689

Answered by danthe3rd
BillyGun27 asked this question in Q&A

Hi,
Memory-efficient attention has been tested on GPUs with CUDA compute capability 6.0+, but it should also work with 5.0+. The kernel relies on CUTLASS, which might not work below 5.0.
If you have an older GPU, it might work if you build from source (with some luck), but I don't recommend it, and we won't be able to support you if you run into questions or issues.
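
If you are not sure which bucket your GPU falls into, here is a rough sketch for checking it from Python. The helper names `classify_capability` and `local_gpu_tier` and the tier labels are made up for illustration and are not part of xFormers; the only library call used is PyTorch's `torch.cuda.get_device_capability()`, and the thresholds simply mirror the 6.0 and 5.0 numbers above.

```python
def classify_capability(major, minor):
    """Map a CUDA compute capability to the tiers described in the answer.

    Hypothetical helper: the labels are illustrative, not an xFormers API.
    """
    cap = (major, minor)
    if cap >= (6, 0):
        return "tested"            # 6.0+ is what the kernel was tested on
    if cap >= (5, 0):
        return "expected to work"  # 5.0+ should work, but is not tested
    return "unsupported"           # below 5.0, CUTLASS might not work


def local_gpu_tier():
    """Return the tier of the current GPU, or None if it cannot be queried."""
    try:
        import torch  # assumption: PyTorch is installed
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # get_device_capability() returns a (major, minor) tuple
    major, minor = torch.cuda.get_device_capability()
    return classify_capability(major, minor)


if __name__ == "__main__":
    print(local_gpu_tier())
```

A result of "unsupported" only means the GPU is below 5.0; as noted above, building from source may still work, but it is not something we support.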

Answer selected by BillyGun27