
Pull requests: vllm-project/flash-attention

Pull requests list

Add sparse attention with vertical and slash
#33 opened Dec 19, 2024 by minminsun
support KV-Compress paged KV cache
#27 opened Nov 27, 2024 by IsaacRe
Add CUDA 8.7 arch for Jetson Orin
#26 opened Nov 27, 2024 by conroy-cheers
Update torch to 2.5.1
#25 opened Nov 7, 2024 by ayakzob
Don't disable uneven k to support more headdims
#21 opened Sep 27, 2024 by njhill
Update .gitignore to ignore *env/ directories
#16 opened Aug 8, 2024 by wasertech