Commit

Document Sync by Tina
Chivier committed Sep 15, 2024
1 parent 2c596c6 commit 68ed6f6
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion docs/stable/store/quickstart.md
@@ -109,7 +109,14 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))

## Usage with vLLM

To use ServerlessLLM as a load format for vLLM, you need to apply our patch `serverless_llm/store/vllm_patch/sllm_load.patch` to the installed vLLM library. Therefore, please make sure you have read and followed the steps in the `vLLM Patch` section under our [installation guide](../getting_started/installation.md).
:::tip
To use ServerlessLLM as the load format for vLLM, you need to apply our patch `serverless_llm/store/vllm_patch/sllm_load.patch` to the installed vLLM library. Please make sure you have applied it as described in the `vLLM Patch` section of the [installation guide](../getting_started/installation.md).
```bash
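# Locate the directory of the installed vLLM package
VLLM_PATH=$(python -c "import vllm; import os; print(os.path.dirname(os.path.abspath(vllm.__file__)))")
# Apply the patch inside that directory (quoted in case the path contains spaces)
patch -p2 -d "$VLLM_PATH" < serverless_llm/store/vllm_patch/sllm_load.patch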
```
:::
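
Before applying the patch, it can help to check where vLLM is installed and preview the exact command. The following is a minimal sketch, not part of ServerlessLLM: the helper names `package_dir` and `patch_command` are hypothetical, and the `--dry-run` flag assumes GNU `patch`. It only prints the command and does not modify anything.

```python
import importlib.util
import os


def package_dir(name):
    """Return the install directory of a package without importing it, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None or not spec.has_location:
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


def patch_command(target_dir, patch_file, dry_run=True):
    """Build the argument list for `patch` (hypothetical helper).

    `patch -d` changes directory before reading the patch file, so the
    patch path is made absolute when it is passed with `-i`.
    """
    cmd = ["patch", "-p2", "-d", target_dir, "-i", os.path.abspath(patch_file)]
    if dry_run:
        # Assumption: GNU patch, which supports --dry-run.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    vllm_dir = package_dir("vllm")
    if vllm_dir is None:
        print("vLLM is not installed in this environment")
    else:
        cmd = patch_command(
            vllm_dir, "serverless_llm/store/vllm_patch/sllm_load.patch"
        )
        print(" ".join(cmd))
```

If the printed dry-run command reports no errors when run from the repository root, the same command without `--dry-run` should be equivalent to the shell snippet in the tip above.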


Our API aims to be compatible with the `sharded_state` load format in vLLM. Because vLLM modifies the model architecture, the model format used with vLLM is **not** the same as the one used with transformers. In the subsequent sections, `ServerlessLLM format` therefore refers to the format integrated with vLLM, which differs from the `ServerlessLLM format` used in the previous sections.
