Pull requests: InternLM/lmdeploy
- #2954: [side-effect] bring back quantization of qwen2-vl, glm4v and etc. (Bug:P1), opened Dec 25, 2024 by lvhan028
- #2953: Fallback to pytorch engine when the model is quantized by smooth quant (improvement), opened Dec 25, 2024 by lvhan028
- #2952: [dlinfer] feat: add DlinferFlashAttention to support qwen vl., opened Dec 25, 2024 by Reinerzhou
- #2949: [ci] add w8a8 and internvl2.5 models into testcase, opened Dec 24, 2024 by zhulinJulia24
- #2918: [maca] support deepseekv2 for maca backend. (enhancement), opened Dec 18, 2024 by Reinerzhou (Draft)
- #2883: support Turbomind ep (enhancement), opened Dec 12, 2024 by irexyc
- #2859: Support Medusa speculative decoding (enhancement), opened Dec 5, 2024 by AllentDan
- #2801: Refactor turbomind attention by precomputing rotary embed (improvement), opened Nov 25, 2024 by irexyc
- #2783: [Feature] Support llava onevision (enhancement), opened Nov 21, 2024 by deepindeed2022
- #2720: support qwen2-vl with turbomind backend (enhancement), opened Nov 6, 2024 by irexyc
- #2701: Run loop.run_until_complete in another thread (improvement), opened Nov 4, 2024 by AllentDan
- #2308: [Feature] Support vision module w8a8 inference (improvement), opened Aug 14, 2024 by AllentDan
- #2289: better formatted table of 'lmdeploy list' (improvement, WIP), opened Aug 12, 2024 by lvhan028
- #2274: [Feature] support qqq(w4a8) for lmdeploy, opened Aug 9, 2024 by HandH1998 (6 tasks done)
- #2191: [Feature] Support XTuner Lite Llava (enhancement), opened Jul 31, 2024 by pppppM