[Question] Gemma-2 sliding window support? #1017
Thanks for your hard work on this project! I'm trying to use gemma-2-27b-it with sglang==0.2.9 and the flashinfer backend (0.1.3). I'd like to use a context length of 8192 with sliding window attention, which seems to be supported by flashinfer. However, the comments in the gemma2 repository suggest that this might not be allowed. Could you clarify whether an 8192 context length with sliding window attention is possible in this setup? Any guidance or suggestions would be greatly appreciated!
Replies: 1 comment 1 reply
Hi @joonkeekim, thanks for your interest! Support for sliding window attention in gemma-2 is under review; see #1056. It is expected to be merged tomorrow.
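For readers unfamiliar with the term, here is a minimal sketch of what a sliding-window causal mask does. It is plain Python for illustration only, not the sglang or flashinfer implementation. The function name `sliding_window_causal_mask` is hypothetical, and the window convention (each query sees the most recent `window` positions, itself included) is an assumption; libraries differ on whether the current token counts toward the window.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[q][k], True when query position q may attend to key position k.

    Assumed convention (hypothetical, for illustration): a query attends to
    itself and the (window - 1) positions before it, never to future positions.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(k <= q) and (q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]


if __name__ == "__main__":
    # Tiny sizes so the pattern is visible; real models use far larger values.
    for row in sliding_window_causal_mask(seq_len=6, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

With a window at least as long as the sequence, the mask reduces to ordinary causal attention, so the window only matters once the context grows past it.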