[bucketing overhaul 2/n] Delegate bucket management to HPUBucketingContext #395

kzawora-intel · 2024-10-15T14:38:30Z

Requires #394

This PR moves the bucket creation and management logic into the HPUBucketingContext class. With this change, the model runner should never have to write to HPUBucketingGlobalState, and all bucketing functions and classes (roughly the first 400 lines of the HPU model runner) can be moved to vllm_hpu_extension in the next PR and covered with unit tests.
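
To illustrate the delegation pattern described above, here is a minimal sketch and not the code in this PR. `BucketingGlobalState`, `BucketingContext`, `ModelRunner`, and every method, parameter, and bucket-generation rule below are hypothetical stand-ins chosen for the example; the real HPUBucketingContext and HPUBucketingGlobalState may look quite different. The point is only that the context is the single writer of the state, while the runner asks it for padded shapes.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BucketingGlobalState:
    """Hypothetical stand-in for HPUBucketingGlobalState: passive storage only."""
    batch_size_buckets: List[int] = field(default_factory=list)
    seq_len_buckets: List[int] = field(default_factory=list)


class BucketingContext:
    """Hypothetical stand-in for HPUBucketingContext: the only writer of the state."""

    def __init__(self, max_batch_size: int, max_seq_len: int, seq_len_step: int = 128):
        self._state = BucketingGlobalState()
        self._max_batch_size = max_batch_size
        self._max_seq_len = max_seq_len
        self._seq_len_step = seq_len_step

    def generate_buckets(self) -> None:
        # Illustrative rule (an assumption): powers of two up to max_batch_size.
        sizes, bs = [], 1
        while bs < self._max_batch_size:
            sizes.append(bs)
            bs *= 2
        sizes.append(self._max_batch_size)

        # Illustrative rule (an assumption): fixed steps up to max_seq_len.
        lens = list(range(self._seq_len_step, self._max_seq_len + 1, self._seq_len_step))
        if not lens or lens[-1] != self._max_seq_len:
            lens.append(self._max_seq_len)

        self._state.batch_size_buckets = sizes
        self._state.seq_len_buckets = lens

    @staticmethod
    def _round_up(value: int, buckets: List[int]) -> int:
        # Smallest bucket that is >= value; buckets must be sorted ascending.
        idx = bisect.bisect_left(buckets, value)
        if idx == len(buckets):
            raise ValueError(f"{value} exceeds the largest bucket {buckets[-1]}")
        return buckets[idx]

    def get_padded_batch_size(self, batch_size: int) -> int:
        return self._round_up(batch_size, self._state.batch_size_buckets)

    def get_padded_seq_len(self, seq_len: int) -> int:
        return self._round_up(seq_len, self._state.seq_len_buckets)


class ModelRunner:
    """Hypothetical runner: reads padded shapes through the context, never writes state."""

    def __init__(self, bucketing_ctx: BucketingContext):
        self.bucketing_ctx = bucketing_ctx

    def padded_shape(self, batch_size: int, seq_len: int) -> Tuple[int, int]:
        return (
            self.bucketing_ctx.get_padded_batch_size(batch_size),
            self.bucketing_ctx.get_padded_seq_len(seq_len),
        )


ctx = BucketingContext(max_batch_size=8, max_seq_len=512)
ctx.generate_buckets()
runner = ModelRunner(ctx)
print(runner.padded_shape(3, 200))
```

Because the runner only calls the context's query methods, a class shaped like this can be unit tested without instantiating a model runner at all.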

michalkuligowski · 2024-10-21T08:27:56Z

It seems that cpu_test fails because an environment variable cannot be read:
/home/runner/work/vllm-fork/vllm-fork/vllm/worker/hpu_model_runner.py, line 210

kzawora-intel added 5 commits October 15, 2024 16:11

Add padding-aware scheduling (38b044b)

format.sh (3ec55be)

fix scheduler bugs (ea1ffaa)

remove debug stuff (c888889)

Delegate bucket management to HPUBucketingContext (f66c9c3)

kzawora-intel added the habana label (Issues or PRs submitted by Habana Labs) on Nov 8, 2024
