Replace FasterTransformer-like KV cache layout and kernel with flash attention for better support of longer sequences #239
Open
JerryGJX wants to merge 2 commits into mit-han-lab:main from JerryGJX:main
+37 -73
Commits on Nov 16, 2024
- Committed by Junxian Guo
- Committed by Junxian Guo