Pull requests: InternLM/lmdeploy
- #2954: [side-effect] bring back quantization of qwen2-vl, glm4v and etc. (Bug:P1), opened Dec 25, 2024 by lvhan028
- #2953: Fallback to pytorch engine when the model is quantized by smooth quant (improvement), opened Dec 25, 2024 by lvhan028
- #2952: [dlinfer] feat: add DlinferFlashAttention to support qwen vl., opened Dec 25, 2024 by Reinerzhou
- #2949: [ci] add w8a8 and internvl2.5 models into testcase, opened Dec 24, 2024 by zhulinJulia24
- #2918: [maca] support deepseekv2 for maca backend. (enhancement), opened Dec 18, 2024 by Reinerzhou (Draft)
- #2883: support Turbomind ep (enhancement), opened Dec 12, 2024 by irexyc
- #2859: Support Medusa speculative decoding (enhancement), opened Dec 5, 2024 by AllentDan
- #2801: Refactor turbomind attention by precomputing rotary embed (improvement), opened Nov 25, 2024 by irexyc
- #2783: [Feature] Support llava onevision (enhancement), opened Nov 21, 2024 by deepindeed2022
- #2720: support qwen2-vl with turbomind backend (enhancement), opened Nov 6, 2024 by irexyc
- #2701: Run loop.run_until_complete in another thread (improvement), opened Nov 4, 2024 by AllentDan
- #2308: [Feature] Support vision module w8a8 inference (improvement), opened Aug 14, 2024 by AllentDan
- #2289: better formatted table of 'lmdeploy list' (improvement, WIP), opened Aug 12, 2024 by lvhan028
- #2274: [Feature] support qqq(w4a8) for lmdeploy, opened Aug 9, 2024 by HandH1998 (6 tasks done)
- #2191: [Feature] Support XTuner Lite Llava (enhancement), opened Jul 31, 2024 by pppppM