
vllm does not work with image URLs #643

Open
ashwinb opened this issue Dec 17, 2024 · 1 comment

Comments

ashwinb (Contributor) commented Dec 17, 2024

System Info

...

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

vLLM does not work when you pass image URLs through directly.

See https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/vllm/vllm.py#L166. If you change that to download=False (so the image URL is forwarded to vLLM instead of being downloaded and inlined by the provider), the request fails.
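For context, with download=False the provider ends up sending an OpenAI-style chat request carrying the raw image URL, roughly like the following (a sketch; the image URL and prompt are placeholders, and the endpoint matches the server started below):

curl http://localhost:6001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        {"type": "text", "text": "Describe this image."}
      ]
    }]
  }'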

How to test?

# run vLLM first
docker run --rm -it -e HUGGING_FACE_HUB_TOKEN=... \
  -v /home/ashwin/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --trust-remote-code \
  --gpu-memory-utilization 0.75 \
  --model meta-llama/Llama-3.2-11B-Vision-Instruct --enforce-eager \
  --max-model-len 4096 --max-num-seqs 16 --port 6001

pytest -v -s -k vllm tests/inference/test_vision_inference.py \
  --env VLLM_URL=http://localhost:6001/v1

Error logs

In the vLLM logs, you see:

INFO:     127.0.0.1:59652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
ERROR 12-16 23:56:17 serving_chat.py:162] Error in loading multi-modal data
ERROR 12-16 23:56:17 serving_chat.py:162] Traceback (most recent call last):
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/client.py", line 663, in _request
ERROR 12-16 23:56:17 serving_chat.py:162]     conn = await self._connector.connect(
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 563, in connect
ERROR 12-16 23:56:17 serving_chat.py:162]     proto = await self._create_connection(req, traces, timeout)
ERROR 12-16 23:56:17 serving_chat.py:162]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 1032, in _create_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     _, proto = await self._create_direct_connection(req, traces, timeout)
ERROR 12-16 23:56:17 serving_chat.py:162]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 1335, in _create_direct_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     transp, proto = await self._wrap_create_connection(
ERROR 12-16 23:56:17 serving_chat.py:162]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 1091, in _wrap_create_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     sock = await aiohappyeyeballs.start_connection(
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohappyeyeballs/impl.py", line 89, in start_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     sock, _, _ = await _staggered.staggered_race(
ERROR 12-16 23:56:17 serving_chat.py:162]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohappyeyeballs/_staggered.py", line 160, in staggered_race
ERROR 12-16 23:56:17 serving_chat.py:162]     done = await _wait_one(
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohappyeyeballs/_staggered.py", line 41, in _wait_one
ERROR 12-16 23:56:17 serving_chat.py:162]     return await wait_next
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162] asyncio.exceptions.CancelledError
ERROR 12-16 23:56:17 serving_chat.py:162]
ERROR 12-16 23:56:17 serving_chat.py:162] The above exception was the direct cause of the following exception:
ERROR 12-16 23:56:17 serving_chat.py:162]
ERROR 12-16 23:56:17 serving_chat.py:162] Traceback (most recent call last):
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 160, in create_chat_completion
ERROR 12-16 23:56:17 serving_chat.py:162]     mm_data = await mm_data_future
ERROR 12-16 23:56:17 serving_chat.py:162]               ^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 235, in all_mm_data
ERROR 12-16 23:56:17 serving_chat.py:162]     items = await asyncio.gather(*self._items)
ERROR 12-16 23:56:17 serving_chat.py:162]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/multimodal/utils.py", line 140, in async_get_and_parse_image
ERROR 12-16 23:56:17 serving_chat.py:162]     image = await async_fetch_image(image_url)
ERROR 12-16 23:56:17 serving_chat.py:162]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/multimodal/utils.py", line 62, in async_fetch_image
ERROR 12-16 23:56:17 serving_chat.py:162]     image_raw = await global_http_connection.async_get_bytes(
ERROR 12-16 23:56:17 serving_chat.py:162]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/connections.py", line 92, in async_get_bytes
ERROR 12-16 23:56:17 serving_chat.py:162]     async with await self.get_async_response(url, timeout=timeout) as r:
ERROR 12-16 23:56:17 serving_chat.py:162]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/client.py", line 1359, in __aenter__
ERROR 12-16 23:56:17 serving_chat.py:162]     self._resp: _RetType = await self._coro
ERROR 12-16 23:56:17 serving_chat.py:162]                            ^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/client.py", line 579, in _request
ERROR 12-16 23:56:17 serving_chat.py:162]     with timer:
ERROR 12-16 23:56:17 serving_chat.py:162]          ^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/helpers.py", line 749, in __exit__
ERROR 12-16 23:56:17 serving_chat.py:162]     raise asyncio.TimeoutError from exc_val

Expected behavior

Per the vLLM documentation, this should have worked. I also tried setting VLLM_IMAGE_FETCH_TIMEOUT=20 when starting the vLLM server.
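Note that the traceback above ends in an asyncio.TimeoutError raised while the vLLM server itself fetches the image (async_fetch_image), which is what VLLM_IMAGE_FETCH_TIMEOUT controls. One thing worth double-checking: since the server runs inside Docker here, the variable has to be set inside the container, e.g. (a sketch; 20 seconds is arbitrary, and the trailing arguments are the same model flags as in the docker run command above):

docker run --rm -it \
  -e HUGGING_FACE_HUB_TOKEN=... \
  -e VLLM_IMAGE_FETCH_TIMEOUT=20 \
  -v /home/ashwin/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  ...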

ashwinb (Contributor, Author) commented Dec 17, 2024

cc @wukaixingxp
