
Model request: Phi 3 mini 128K #432

Closed
flatsiedatsie opened this issue May 29, 2024 · 6 comments

Comments

@flatsiedatsie

Seems like a good match for WebLLM, as it was practically designed to run in the browser.

From this reddit thread:
https://www.reddit.com/r/LocalLLaMA/comments/1d2o445/comment/l63cvxk/

@CharlieFRuan
Contributor

Phi3-mini, StableLM 1.6B, Qwen 1.8B were just added to the prebuilt list here: #433

Will bump the version to 0.2.39 soon.

Note the phi3 we added was 4k instead of 128K.

If I understand correctly, supporting a 128K context length requires allocating a KV cache that is 128K on the sequence dimension, which comes to head_dim * num_layers * num_kv_heads * {k,v} * sizeof(f16) * 128K bytes, i.e. 96 * 32 * 32 * 2 * 2 * 128000 ≈ 46GB, as opposed to about 1.5GB for a 4K context length.

@CharlieFRuan
Contributor

Just published 0.2.39; those models are now included in the prebuilt app config!

@flatsiedatsie
Author

Very nice.

If you don't mind, I'll keep this open for now. I think the 128K context version would still offer something valuable to WebLLM.

@CharlieFRuan
Contributor

npm 0.2.62 now supports Phi3.5-mini: #556

Phi-3.5-mini supports up to 128K context (unlike Phi-3-mini, which only has 4K), thanks to RoPE scaling, which MLC-LLM supports. You can take advantage of this in WebLLM by increasing ModelRecord.overrides.context_window_size, or by specifying it in ChatOptions when loading a model, as long as there is enough VRAM.
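A minimal sketch of the ChatOptions route, assuming the current `@mlc-ai/web-llm` API shape; the model id here is illustrative, so check the prebuilt app config for the exact record names:

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Assumed model id; see the prebuilt app config for the real list.
const engine = await CreateMLCEngine(
  "Phi-3.5-mini-instruct-q4f16_1-MLC",
  {}, // engine config, e.g. an initProgressCallback
  {
    // Raise the context window beyond the model record's default.
    // Every extra token costs KV-cache VRAM, so size this to the
    // GPU you are targeting rather than jumping straight to 128K.
    context_window_size: 32768,
  },
);
```

The same override can instead be baked into a custom app config via `ModelRecord.overrides.context_window_size` if you want it applied for every load of that model.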

@CharlieFRuan
Contributor

Closing this issue for now, as Phi-3.5 should meet the need described. Feel free to open new ones if new issues arise!

@flatsiedatsie
Author

Brilliant, thank you!
