Model request: Phi 3 mini 128K #432
Phi-3-mini, StableLM 1.6B, and Qwen 1.8B were just added to the prebuilt list here: #433. Will bump the version to 0.2.39 soon. Note that the Phi-3 we added is the 4K variant instead of 128K. If I understand correctly, to support a 128K context length we need to allocate a KV cache that is 128K on the sequence dimension, which yields …
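To put a rough number on that, here is a minimal back-of-the-envelope sketch. It assumes Phi-3-mini's published dimensions (32 layers, hidden size 3072, i.e. 32 KV heads × 96 head_dim) and an unquantized fp16 cache; the exact figures depend on the real model config and any cache quantization.

```typescript
// Rough KV-cache size estimate (a sketch, not WebLLM's actual allocator).
// Assumed Phi-3-mini dimensions: 32 layers, hidden size 3072, fp16 entries.
const numLayers = 32;
const hiddenSize = 3072;  // num_kv_heads * head_dim
const bytesPerElem = 2;   // fp16

function kvCacheBytes(seqLen: number): number {
  // 2 tensors (K and V) per layer, each of shape [seqLen, hiddenSize]
  return 2 * numLayers * hiddenSize * seqLen * bytesPerElem;
}

console.log((kvCacheBytes(4 * 1024) / 2 ** 30).toFixed(1), "GiB");   // ~1.5 GiB at 4K
console.log((kvCacheBytes(128 * 1024) / 2 ** 30).toFixed(1), "GiB"); // ~48 GiB at 128K
```

Under those assumptions, a full 128K cache is on the order of tens of GiB, far beyond what a browser GPU can hold, which is why the 4K variant was prebuilt.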
Just published 0.2.39; those models are now included in the prebuilt app config!
Very nice. If you don't mind, I'll keep this open for now; I think the 128K context version would still offer something valuable to WebLLM.
npm 0.2.62 now supports Phi-3.5-mini: #556. Phi-3.5-mini supports up to 128K context (unlike Phi-3-mini, which only has 4K) thanks to rope scaling, which MLC-LLM supports; you can take advantage of it in WebLLM by increasing …
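For reference, a hedged sketch of how one might raise the context window when loading the model in WebLLM; the model_id string and the `context_window_size` field are assumptions based on the prebuilt app config and ChatOptions, so verify both against the docs for your WebLLM version.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Sketch: load Phi-3.5-mini with a larger context window.
// "Phi-3.5-mini-instruct-q4f16_1-MLC" and `context_window_size` are assumed
// names; check the prebuilt app config / ChatOptions for your version.
async function main() {
  const engine = await CreateMLCEngine(
    "Phi-3.5-mini-instruct-q4f16_1-MLC",                  // assumed prebuilt model_id
    { initProgressCallback: (p) => console.log(p.text) }, // report download/compile progress
    { context_window_size: 32768 }                        // larger window => larger KV cache, see above
  );

  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize this long document ..." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```

Note the trade-off: every extra token of context window grows the KV cache linearly, so pick a value your users' GPUs can actually accommodate rather than the full 128K.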
Closing this issue for now, as Phi-3.5 should satisfy the need described. Feel free to open new ones if new issues arise!
Brilliant, thank you!
Seems like a good match for WebLLM, as it was practically designed to run in the browser.
From this reddit thread:
https://www.reddit.com/r/LocalLLaMA/comments/1d2o445/comment/l63cvxk/