Pull requests: HabanaAI/vllm-fork

Pull requests list

[WIP] Add HPU support to vLLM v1
#487 opened Nov 12, 2024 by kzawora-intel Draft
6 of 12 tasks
Update Dockerfile.hpu
#486 opened Nov 12, 2024 by michalkuligowski
Nov 12 rebase
#485 opened Nov 12, 2024 by kzawora-intel
Enable DeepseekV2 Lite/Chat models
#482 opened Nov 11, 2024 by hlin99
GPTQ Support [Cont.]
#481 opened Nov 8, 2024 by maktukmak
Overhaul padding aware scheduling
#479 opened Nov 8, 2024 by kzawora-intel
Add FP8 TP=2 scenario to Jenkins CI (labels: enhancement, habana)
#478 opened Nov 8, 2024 by afierka-intel Draft
enable acc for benchmark_throughput
#472 opened Nov 6, 2024 by hsubramony
[DO NOT MERGE] Upstream codebase diff (labels: habana)
#470 opened Nov 6, 2024 by kzawora-intel Draft
AWQ Support
#458 opened Nov 4, 2024 by maktukmak
Config hidden layer number to run in 1 lazy graph
#451 opened Nov 1, 2024 by libinta
to make repetition penalty faster
#442 opened Oct 29, 2024 by ccrhx4
Add models-tiny CI step with Llama3.2-1B (labels: habana)
#440 opened Oct 28, 2024 by kzawora-intel Draft
Add HPU information to collect_env script (labels: habana)
#430 opened Oct 25, 2024 by michalkuligowski
GPTQ Support
#421 opened Oct 23, 2024 by maktukmak
[PoC] Add max padding ratio to padding aware scheduler (labels: habana)
#407 opened Oct 18, 2024 by kzawora-intel Draft
Create run-lm-eval-mmlu.sh (labels: habana)
#399 opened Oct 16, 2024 by michalkuligowski Draft
WA for OOM in qwen 2 - sync after loading weights (labels: habana)
#398 opened Oct 16, 2024 by michalkuligowski
[bucketing overhaul 2/n] Delegate bucket management to HPUBucketingContext (labels: habana)
#395 opened Oct 15, 2024 by kzawora-intel
Add bucket calibration, allow reading/writing bucketing configs to file (labels: habana)
#345 opened Sep 27, 2024 by kzawora-intel
Optimize LoRA mask creation (labels: habana)
#285 opened Sep 13, 2024 by SanjuCSudhakaran Draft
[build] Changes for RH build (labels: external)
#190 opened Aug 15, 2024 by Xaenalt