-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inference to use provider resource id to register and validate #428
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make sure to change the zero-to-hero notebooks as well?
0f7fcfd
to
948f6ec
Compare
@@ -237,7 +237,7 @@ async def completion( | |||
@webmethod(route="/inference/chat_completion") | |||
async def chat_completion( | |||
self, | |||
model: str, | |||
model_id: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llama-stack-apps examples also needs to be updated: https://github.com/meta-llama/llama-stack-apps/blob/0dc9c42fb42bf21d35e6d231afc4e0360a9eac61/examples/inference/client.py#L46-L49
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llama-stack-apps and llama-stack-client-python also needs to be updated to reflect the model
-> model_id
change.
very useful change! |
This PR changes the way model id gets translated to the final model name that gets passed through the provider.
Major changes include:
Tested all inference providers including together, fireworks, vllm, ollama, meta reference and bedrock