* feat: Use vllm server
* fix: Use self.model
* debug
* debug
* feat: Try using guided_json
* fix: Use extra_body
* fix: Use json
* fix: Use model_json_schema
* chore: Include response_format
* chore: Logging
* chore: Remove logging
* debug
* fix: Do not break streaming if chunk_str is None
* debug
* feat: Spawn new vLLM server if not already running
* fix: Do not use api_key if running vLLM generator
* fix: vLLM config
* chore: Remove breakpoint
* debug
* debug
* fix: Set server after booting it
* debug
* debug
* fix: Add sleep after server start
* fix: Only require CUDA to start the vLLM inference server, not to use one
* fix: Only set `guided_json` if using vLLM
* tests: vLLM tests
* feat: Add more args to vLLM server
* fix: Typo
* debug
* fix: Up vLLM startup sleep time
* debug
* debug
* debug
* debug
* fix: Add port back in
* fix: Set up self.server in OpenaiGenerator correctly
* debug
* fix: Store config in VllmGenerator
* debug
* feat: Check manually if Uvicorn server has started
* feat: Block stderr when loading tokenizer
* debug
* refactor: Use HiddenPrints
* fix: Block transformers logging
* feat: Add --host back in
* debug
* fix: Add `del self` in `__del__`
* chore: Ignore ResourceWarning in pytest
* tests: Initialise the VllmGenerator fewer times in tests
* fix: Do not hardcode different ports
* tests: Use same VllmGenerator
* tests: Remove validity check test, as it is impossible with VllmGenerator
* tests: Remove random_seed from VllmGenerator config
* docs: Add comments
* fix: Raise ValueError in get_component_by_name if module or class don't exist
* docs: Update coverage badge
* chore: Re-instate pre-commit hook
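Several of the commits above (`guided_json`, `extra_body`, `model_json_schema`, and only setting `guided_json` when using vLLM) revolve around constrained JSON decoding against a vLLM OpenAI-compatible server. A minimal sketch of that flow, assuming a hypothetical `Answer` Pydantic schema (the real project defines its own response model) and a server already listening on port 8000:

```python
from openai import OpenAI
from pydantic import BaseModel


class Answer(BaseModel):
    """Hypothetical response schema, a stand-in for the project's own model."""
    answer: str
    sources: list[str]


# vLLM's OpenAI-compatible server does not check the API key, hence the dummy.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

using_vllm = True  # in the project this would come from the generator config
extra_body: dict = {}
if using_vllm:
    # `guided_json` is a vLLM extension, so it is only set for vLLM backends.
    extra_body["guided_json"] = Answer.model_json_schema()

response = client.chat.completions.create(
    model="ThatsGroes/munin-SkoleGPTOpenOrca-7b-16bit",
    messages=[{"role": "user", "content": "..."}],
    temperature=0.0,
    max_tokens=256,
    stream=True,
    extra_body=extra_body,
)
```

Routing the schema through `extra_body` keeps the request shape compatible with the plain OpenAI API: non-vLLM backends simply never receive the `guided_json` key.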
1 parent 25e5fa3 · commit 5dfeee2
Showing 7 changed files with 152 additions and 145 deletions.
```diff
@@ -1,9 +1,12 @@
 name: vllm
 model: ThatsGroes/munin-SkoleGPTOpenOrca-7b-16bit
-max_model_len: 10_000
-gpu_memory_utilization: 0.95
 temperature: 0.0
 max_tokens: 256
 stream: true
 timeout: 60
 system_prompt: ${..language.system_prompt}
 prompt: ${..language.prompt}
+max_model_len: 10_000
+gpu_memory_utilization: 0.95
+server: null
+port: 8000
```
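The new `server: null` and `port: 8000` keys tie in with the "Spawn new vLLM server if not already running" and "Check manually if Uvicorn server has started" commits. A minimal sketch of that startup logic, assuming vLLM's standard `vllm.entrypoints.openai.api_server` entrypoint and its `/health` endpoint (exact flag names can vary between vLLM versions):

```python
import subprocess
import time

import requests

# Values mirror the config above.
MODEL = "ThatsGroes/munin-SkoleGPTOpenOrca-7b-16bit"
PORT = 8000

# Spawn the OpenAI-compatible vLLM server as a subprocess.
server = subprocess.Popen(
    [
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--model", MODEL,
        "--max-model-len", "10000",
        "--gpu-memory-utilization", "0.95",
        "--host", "0.0.0.0",
        "--port", str(PORT),
    ]
)

# Poll the health endpoint instead of sleeping for a fixed time, so startup
# is detected as soon as Uvicorn is actually serving requests.
for _ in range(300):  # give up after ~5 minutes
    try:
        if requests.get(f"http://localhost:{PORT}/health", timeout=1).ok:
            break
    except requests.RequestException:
        pass
    time.sleep(1)
else:
    server.terminate()
    raise RuntimeError("vLLM server did not start in time")
```

Polling replaces the fixed post-start sleep that the commit history shows being tuned upwards, and terminating the subprocess on timeout avoids leaving an orphaned server behind.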