
Fireworks support for tool_choice=required #656

Closed
aidando73 opened this issue Dec 19, 2024 · 2 comments

aidando73 commented Dec 19, 2024

🚀 Describe the new functionality needed

Support for tool_choice=required (in inference).

When testing against Fireworks, I found they support tool_choice=any, which semantically means the same thing [1].
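As a minimal sketch of the mapping being proposed, llama-stack could translate the OpenAI-style `tool_choice="required"` into Fireworks' `tool_choice="any"` when building the request body. The helper names and payload shape below are assumptions for illustration, not the actual adapter code:

```python
def adapt_tool_choice_for_fireworks(tool_choice: str) -> str:
    """Map an OpenAI-style tool_choice value to the Fireworks equivalent.

    Fireworks uses "any" where the OpenAI API uses "required";
    other values ("auto", "none") pass through unchanged.
    """
    return "any" if tool_choice == "required" else tool_choice


def build_fireworks_payload(messages, tools, tool_choice="auto"):
    # Hypothetical request-body builder; the real request would go to the
    # Fireworks chat completions endpoint.
    return {
        "messages": messages,
        "tools": tools,
        "tool_choice": adapt_tool_choice_for_fireworks(tool_choice),
    }
```

The point is that the translation is a one-line mapping at the provider boundary, so llama-stack's public API can keep the standard `required` value.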

💡 Why is this needed? What if we don't build it?

There are use cases where you rely on tool calls being generated, but llama-stack generates zero tool calls. I've created a repro here: aidando73/llama-code-review#3.

In that example, llama-stack gives me zero tool calls, but hitting the Fireworks API directly I get pretty solid tool calls.

If we don't build this, users would be forced to hit the API directly instead of going through llama-stack.


aidando73 commented Dec 19, 2024

Here's a POC PR: #657

It uses the Fireworks chat completion API, so we're relying on their implementation.


aidando73 commented Dec 19, 2024

Oh, I realize now that the problem I was having is that we haven't updated prompt_adapter.py for the 3.3 models yet. The 3.3 prompt format works the same way as 3.1: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_3. So we need to update prompt_adapter.py to:

    if (
        model.model_family == ModelFamily.llama3_1
        or (
            model.model_family == ModelFamily.llama3_2
            and is_multimodal(model.core_model_id)
        )
        or model.model_family == ModelFamily.llama3_3
    ):
        # llama3.1, llama3.2 multimodal, and llama3.3 models all follow
        # the llama3.1 tool prompt format
        messages = augment_messages_for_tools_llama_3_1(request)
    elif model.model_family == ModelFamily.llama3_2:
        messages = augment_messages_for_tools_llama_3_2(request)
    else:
        messages = request.messages
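To sanity-check that branch logic in isolation, here's a self-contained sketch with stubbed versions of `ModelFamily`, `is_multimodal`, and the model object. All stubs are assumptions for illustration; the real definitions live inside llama-stack:

```python
from collections import namedtuple
from enum import Enum


class ModelFamily(Enum):
    # Stub of llama-stack's ModelFamily enum (assumption, for illustration).
    llama3_1 = "llama3_1"
    llama3_2 = "llama3_2"
    llama3_3 = "llama3_3"


# Minimal stand-in for the model descriptor the adapter receives.
Model = namedtuple("Model", ["model_family", "core_model_id"])


def is_multimodal(core_model_id) -> bool:
    # Stub: treat model ids containing "vision" as multimodal.
    return "vision" in str(core_model_id)


def select_prompt_format(model) -> str:
    # Mirrors the proposed dispatch: 3.1, 3.2-multimodal, and 3.3
    # all share the 3.1 tool prompt format.
    if (
        model.model_family == ModelFamily.llama3_1
        or (
            model.model_family == ModelFamily.llama3_2
            and is_multimodal(model.core_model_id)
        )
        or model.model_family == ModelFamily.llama3_3
    ):
        return "llama3_1_tool_format"
    elif model.model_family == ModelFamily.llama3_2:
        return "llama3_2_tool_format"
    return "raw_messages"
```

With this, a 3.3 model and a 3.2 vision model both land on the 3.1 tool format, while a text-only 3.2 model takes the 3.2 branch.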

But is it just me, or is Fireworks a bit more consistent with their tool calls when tool_choice=any? Are they doing something special under the hood 🤔?
