Inference to use provider resource id to register and validate #428

dineshyv · 2024-11-12T18:19:23Z

This PR changes the way model id gets translated to the final model name that gets passed through the provider.
Major changes include:

Providers are responsible for registering an object and as part of the registration returning the object with the correct provider specific name of the model provider_resource_id
To help with the common look ups different names a new ModelLookup class is created.

Tested all inference providers including together, fireworks, vllm, ollama, meta reference and bedrock

llama_stack/providers/remote/inference/vllm/vllm.py

raghotham

can you make sure to change the zero-to-hero notebooks as well?

llama_stack/providers/remote/inference/vllm/vllm.py

llama_stack/providers/tests/inference/test_text_inference.py

yanxi0830 · 2024-11-13T01:05:50Z

llama_stack/apis/inference/inference.py

@@ -237,7 +237,7 @@ async def completion(
    @webmethod(route="/inference/chat_completion")
    async def chat_completion(
        self,
-        model: str,
+        model_id: str,


llama-stack-apps examples also needs to be updated: https://github.com/meta-llama/llama-stack-apps/blob/0dc9c42fb42bf21d35e6d231afc4e0360a9eac61/examples/inference/client.py#L46-L49

https://llama-stack.readthedocs.io/en/latest/getting_started/index.html#chat-completion-test --> also these docs

llama-stack-apps and llama-stack-client-python also needs to be updated to reflect the model -> model_id change.

https://github.com/meta-llama/llama-stack-apps/blob/0dc9c42fb42bf21d35e6d231afc4e0360a9eac61/examples/inference/client.py#L46-L49

llama_stack/providers/utils/inference/model_registry.py

ashwinb · 2024-11-13T02:47:25Z

very useful change!

dineshyv requested review from ashwinb, yanxi0830, hardikjshah, dltn and raghotham as code owners November 12, 2024 18:19

facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 12, 2024

yanxi0830 reviewed Nov 12, 2024

View reviewed changes

llama_stack/providers/remote/inference/vllm/vllm.py Outdated Show resolved Hide resolved

raghotham reviewed Nov 12, 2024

View reviewed changes

Dinesh Yeduguru and others added 9 commits November 12, 2024 14:26

use provider resource id to validate for models

95b7f57

fix model provider validation and inference params

d69f4f8

fix bedrock

25d8ab0

working fireworks and together

8de4cee

ollama and databricks

5b2282a

ollama

71219b4

vllm

92ee627

bedrock

d587473

fixes for all providers

948f6ec

dineshyv force-pushed the dineshyv/stackid-to-providerid branch from 0f7fcfd to 948f6ec Compare November 12, 2024 22:30

dineshyv and others added 2 commits November 12, 2024 15:37

fixes after rebase

919d421

run openapi gen

55d66ca

yanxi0830 reviewed Nov 13, 2024

View reviewed changes

llama_stack/providers/remote/inference/vllm/vllm.py Outdated Show resolved Hide resolved

yanxi0830 reviewed Nov 13, 2024

View reviewed changes

llama_stack/providers/tests/inference/test_text_inference.py Show resolved Hide resolved

yanxi0830 reviewed Nov 13, 2024

View reviewed changes

raghotham approved these changes Nov 13, 2024

View reviewed changes

fix evals and scoring

606df22

ashwinb reviewed Nov 13, 2024

View reviewed changes

llama_stack/providers/utils/inference/model_registry.py Outdated Show resolved Hide resolved

remove model lookup class

1bb01f9

dineshyv merged commit fdff24e into main Nov 13, 2024
2 checks passed

dineshyv deleted the dineshyv/stackid-to-providerid branch November 13, 2024 04:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inference to use provider resource id to register and validate #428

Inference to use provider resource id to register and validate #428

dineshyv commented Nov 12, 2024 •

edited

Loading

raghotham left a comment

yanxi0830 Nov 13, 2024 •

edited

Loading

yanxi0830 Nov 13, 2024

dineshyv Nov 13, 2024

yanxi0830 Nov 13, 2024

ashwinb commented Nov 13, 2024

Inference to use provider resource id to register and validate #428

Inference to use provider resource id to register and validate #428

Conversation

dineshyv commented Nov 12, 2024 • edited Loading

raghotham left a comment

Choose a reason for hiding this comment

yanxi0830 Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

yanxi0830 Nov 13, 2024

Choose a reason for hiding this comment

dineshyv Nov 13, 2024

Choose a reason for hiding this comment

yanxi0830 Nov 13, 2024

Choose a reason for hiding this comment

ashwinb commented Nov 13, 2024

dineshyv commented Nov 12, 2024 •

edited

Loading

yanxi0830 Nov 13, 2024 •

edited

Loading