Feat/use vllm server #40

Merged 58 commits into main from feat/use-vllm-server on May 22, 2024
Conversation

@saattrupdan (Collaborator) commented May 22, 2024

This changes the vLLM generator to use the vLLM server instead of the Python interface. This simplifies things a bit, since we can re-use the OpenaiGenerator code, and it is also required to enable streaming with vLLM models.
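
Since the vLLM server exposes an OpenAI-compatible API, streaming against it can be sketched with the standard `openai` client. This is only an illustration of the pattern the re-used OpenaiGenerator code relies on; the URL, port, and model name below are placeholders, and the actual generator code may differ:

```python
# Minimal sketch: streaming from a vLLM OpenAI-compatible server.
# The base_url and model name are hypothetical; adjust to your setup.
from openai import OpenAI

# Point the standard OpenAI client at the vLLM server instead of api.openai.com.
# The vLLM server does not require a real API key by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    messages=[{"role": "user", "content": "What is retrieval-augmented generation?"}],
    stream=True,
)

# Print tokens as they arrive; the first chunk may carry no content.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        print(delta, end="", flush=True)
```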

By default, this starts a new server in a background process, but you can also run a separate server yourself and set `generator.server=<url-to-server>` to use the existing one.
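
For the default case, the background-process behaviour could look roughly like the sketch below. The launch flags are standard vLLM CLI options, but the model name and port are placeholders and the PR's actual launch-and-wait logic may differ:

```python
# Sketch: launch a vLLM OpenAI-compatible server as a background process,
# wait until it is ready, and shut it down when done. Hypothetical values.
import subprocess
import sys
import time

import requests

proc = subprocess.Popen(
    [
        sys.executable, "-m", "vllm.entrypoints.openai.api_server",
        "--model", "mistralai/Mistral-7B-Instruct-v0.2",
        "--port", "8000",
    ]
)

# Block until the server answers on the OpenAI-compatible /v1/models route.
while True:
    try:
        requests.get("http://localhost:8000/v1/models", timeout=1)
        break
    except requests.exceptions.RequestException:
        time.sleep(1)

# ... send requests as in the streaming example above ...

# Terminate the background server when the generator is closed.
proc.terminate()
```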

@saattrupdan requested a review from @AJDERS on May 22, 2024 09:32
@saattrupdan self-assigned this on May 22, 2024
@saattrupdan merged commit 5dfeee2 into main on May 22, 2024 (2 checks passed)
@saattrupdan deleted the feat/use-vllm-server branch on May 22, 2024 11:46