
Updates for vllm 0.6.2 #12338

Merged: 16 commits into main, Nov 12, 2024
Conversation

@gc-fu (Contributor) commented Nov 5, 2024

Description

Updates to use vLLM 0.6.2.

We need to change the following:

  • Initial Dockerfile
  • vLLM-related updates
  • Update benchmark_latency.py
  • Update benchmark_throughput.py
  • Examples in ipex-llm/python/llm/example/GPU/vLLM-Serving
  • vLLM worker
  • Update the final Dockerfile before merge; this is for changing the build branches, check the TODO in the code.
  • Merge "Fix awq and gptq error on vllm 0.6.2" (analytics-zoo/vllm#47)
  • Test image functionality... done by Wang, Jun
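For context, the two benchmark scripts named above are the stock vLLM benchmarks. A typical invocation looks roughly like the sketch below; the model name, prompt count, and lengths are illustrative placeholders, not values taken from this PR.

```shell
# Hypothetical benchmark run against the vLLM backend; all argument
# values are placeholders and depend on your model and hardware.
python benchmark_throughput.py \
    --backend vllm \
    --model meta-llama/Llama-2-7b-hf \
    --num-prompts 100 \
    --input-len 128 \
    --output-len 128
```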

1. Why the change?

2. User API changes

3. Summary of the change

4. How to test?

  • N/A
  • Unit test: Please manually trigger the PR Validation by entering the PR number (e.g., 1234), and paste the action link here once it has finished successfully.
  • Application test
  • Document test
  • ...

5. Known issues

  • Sometimes this fails on initial startup with a timeout error...
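Since the timeout reportedly happens only on initial startup, one generic mitigation (a sketch of common practice, not something from this PR) is to poll for readiness with a deadline before sending the first request, e.g. by probing the server's health endpoint:

```python
import time

def wait_until_ready(check, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds elapse.

    check: zero-argument callable returning True once the server is up,
    e.g. an HTTP GET against a health endpoint. It is injected as a
    parameter here (an assumption for illustration) so the helper is
    testable without a running server.
    Returns True if the server became ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        sleep(interval)  # back off between probes
    return False
```

With a long enough timeout, a slow first-time startup no longer surfaces as a hard failure to the client.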

@gc-fu (Contributor, Author) commented Nov 6, 2024

    @@ -17,7 +17,7 @@ In this example, we will run Llama2-7b model using Arc A770 and provide `OpenAI-

     ### 0. Environment

    -To use Intel GPUs for deep-learning tasks, you should install the XPU driver and the oneAPI Base Toolkit 2024.0. Please check the requirements at [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU#requirements).
    +To use Intel GPUs for deep-learning tasks, you should install the XPU driver and the oneAPI Base Toolkit 2024.1. Please check the requirements at [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU#requirements).
Contributor:

Does the link correspond to oneAPI 2024.1? ipex-llm is mainly using 2024.0 on Arc, isn't it?

Contributor (Author):

    VLLM_BUILD_XPU_OPS=1 pip install --no-build-isolation -v -e .
    pip install outlines==0.0.34 --no-deps
    pip install interegular cloudpickle diskcache joblib lark nest-asyncio numba scipy
    VLLM_TARGET_DEVICE=xpu pip install --no-build-isolation -v . && \
@xiangyuT (Contributor) commented Nov 12, 2024:

remove `&& \`

Contributor (Author):

Fixed
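For reference, the corrected line with the trailing `&& \` removed (as requested in the review above) would read:

```shell
# Build and install vLLM for Intel XPU; no trailing continuation needed
# when this is the last command in the Dockerfile RUN step.
VLLM_TARGET_DEVICE=xpu pip install --no-build-isolation -v .
```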

@gc-fu (Contributor, Author) commented Nov 12, 2024

@xiangyuT (Contributor) left a comment:

lgtm

@gc-fu merged commit 0ee54fc into main on Nov 12, 2024. 1 check passed.