In the `run_vllm_server.py` script, the vLLM server is configured with `hf_model_name` as:
```python
def model_setup(hf_model_id):
    # TODO: check HF repo access with HF_TOKEN supplied
    print(f"using model: {hf_model_id}")
    args = {
        "model": hf_model_id,
        ...
```
but in the vLLM TT worker implementation we have, for example:

```python
if ("meta-llama/Meta-Llama-3.1-8B" in self.model_config.model...
```
`setup.sh` generates the following vars:

```
$HF_MODEL_REPO_ID=meta-llama/Llama-3.1-8B-Instruct
$META_MODEL_NAME=Meta-Llama-3.1-8B-Instruct
```
but neither of them explicitly matches the string expected by vLLM, so we should consolidate these naming conventions.
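One possible direction, as a sketch under stated assumptions: pick a single canonical form (e.g. the HF repo ID) and resolve every known alias to it, so `run_vllm_server.py`, `setup.sh`, and the TT worker all compare against the same string. The names below (`canonical_hf_repo_id`, `_ALIASES`) are hypothetical and exist only for illustration.

```python
# Hypothetical consolidation sketch; none of these names exist in the repo today.

# The mismatch described above: the TT worker's substring check
# fails for both strings that setup.sh generates.
EXPECTED_BY_VLLM = "meta-llama/Meta-Llama-3.1-8B"
assert EXPECTED_BY_VLLM not in "meta-llama/Llama-3.1-8B-Instruct"  # $HF_MODEL_REPO_ID
assert EXPECTED_BY_VLLM not in "Meta-Llama-3.1-8B-Instruct"        # $META_MODEL_NAME

# One way to consolidate: treat the HF repo ID as canonical and resolve
# known aliases to it once, in a single shared helper.
_ALIASES = {
    "Meta-Llama-3.1-8B-Instruct": "meta-llama/Llama-3.1-8B-Instruct",
    "meta-llama/Meta-Llama-3.1-8B-Instruct": "meta-llama/Llama-3.1-8B-Instruct",
}

def canonical_hf_repo_id(name: str) -> str:
    """Resolve a model name or known alias to its canonical HF repo ID."""
    return _ALIASES.get(name, name)
```

With something like this in place, the worker could do an exact comparison against the canonical ID instead of a substring test, making its expectations explicit and failing loudly on unknown names.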