Pull requests: vllm-project/vllm
[Misc] Add offline test for disaggregated prefill
#12418 opened Jan 24, 2025 by Shaoting-Feng
[Bugfix] Disable w16a16 2of4 sparse CompressedTensors24
#12417 opened Jan 24, 2025 by tlrmchlsmth
[V1] Revert uncache_blocks and support recaching full blocks
Labels: ready
#12415 opened Jan 24, 2025 by comaniac
[Bugfix][Kernel] Fix moe align block issue for mixtral
Labels: ready
#12413 opened Jan 24, 2025 by ElizaWszola
[Bugfix] Fix BLIP-2 processing
Labels: ready
#12412 opened Jan 24, 2025 by DarkLight1337
[Frontend] Support override generation config in args
#12409 opened Jan 24, 2025 by liuyanyi
[ROCm][MoE] MI300 tuned configs Mixtral-8x(7B,22B) | fp16, fp8
Labels: ready
#12408 opened Jan 24, 2025 by divakar-amd
[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len
#12407 opened Jan 24, 2025 by WangErXiao
[Bugfix][Kernel] FA3 Fix - RuntimeError: This flash attention build only supports pack_gqa (for build size reasons).
Labels: ci/build, ready
#12405 opened Jan 24, 2025 by LucasWilkinson
[Bugfix] Fix output_tokens is 0 if using tgi backend
#12394 opened Jan 24, 2025 by sywangyi
[torch.compile] PyTorch 2.6 and nightly compatibility
#12393 opened Jan 24, 2025 by youkaichao
[Hardware][Intel GPU] add XPU bf16 support
Labels: documentation
#12392 opened Jan 24, 2025 by jikunshang
[Frontend] Rerank API (Jina- and Cohere-compatible API)
Labels: documentation, frontend
#12376 opened Jan 24, 2025 by K-Mistele
[Core] add and implement VLLM_LOGITS_PROCESSOR_THREADS
#12368 opened Jan 23, 2025 by akeshet
[Hardware][Intel-Gaudi] Enable FusedSDPA support for Intel Gaudi (HPU)
#12359 opened Jan 23, 2025 by SanjuCSudhakaran • Draft
[Misc] Add FA2 support to ViT MHA layer
Labels: ready
#12355 opened Jan 23, 2025 by Isotr0py
[Bugfix] handle alignment of arguments in convert_sparse_cross_attention_mask_to_dense
#12347 opened Jan 23, 2025 by tjohnson31415