[Develop] Performance Improving Feature #1105
Comments
1 and 3 are interesting to us. See sglang/python/sglang/srt/server_args.py, lines 426 to 430 at commit 5ff25cd.
Please join our Slack channel so we can discuss this further: https://join.slack.com/t/sgl-fru7574/shared_invite/zt-2ngly9muu-t37XiH87qvD~6rVBTkTEHw
I have implemented plan 1 in this PR: #1142.
Contributions are very welcome! https://arxiv.org/pdf/2406.16858
We very much welcome features that improve performance. Overall, we hope the PRs submitted can adhere to the following principles:
This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.
I want to develop some features based on SGLang to improve the performance of SRT.
transfer
would be implemented by KV cache swapping to avoid extra computation. Looking forward to everyone's suggestions. 😊