Replace FasterTransformer-like KV cache layout and kernel with FlashAttention for better support of longer sequences #239

Open
JerryGJX wants to merge 2 commits into mit-han-lab:main from JerryGJX:main