Skip to content

Commit

Permalink
Document Sync by Tina
Browse files Browse the repository at this point in the history
  • Loading branch information
Chivier committed Aug 13, 2024
1 parent 3f2d557 commit bcc603e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/stable/intro.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ sidebar_position: 1
# Serverless LLM

<!-- Scaled logo -->
<img src="../images/serverlessllm.jpg" alt="ServerlessLLM" width="256px">
<img src="../images/serverlessllm.jpg" alt="ServerlessLLM" width="256px"/>

ServerlessLLM is a **fast** and **easy-to-use** serving system designed for **affordable** multi-LLM serving, also known as LLM-as-a-Service. ServerlessLLM is ideal for environments with multiple LLMs that need to be served on limited GPU resources, as it enables efficient dynamic loading of LLMs onto GPUs. By elastically scaling model instances and multiplexing GPUs, ServerlessLLM can significantly reduce costs compared to traditional GPU-dedicated serving systems while still providing low-latency (Time-to-First-Token, TTFT) LLM completions.

Expand Down

0 comments on commit bcc603e

Please sign in to comment.