
Add tools calling for dspy.LM #2023

Open
wants to merge 2 commits into base: main
Conversation

@chenmoneygithub (Collaborator) commented Jan 8, 2025

Add a tools arg to dspy.LM, since litellm natively supports tool calling. We may also consider having dspy.ReAct use the default tool calling from LLM providers for robustness.

Sample code:

import dspy

lm = dspy.LM("openai/gpt-4o-mini")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

outputs = lm("What's the weather like in Paris today?", tools=tools)

print(outputs)

The output is:

(dspy) [add-tool-calling][~/Documents/genai/dspy]$ python3 tool_tmp.py
[{'text': None, 'tool_calls': [ChatCompletionMessageToolCall(function=Function(arguments='{"location":"Paris, France"}', name='get_current_weather'), id='call_s7pLflqklT2MekZt8eIOBQDU', type='function')]}]
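A rough sketch of how a caller might consume that output shape. The Function and ToolCall classes below are stand-ins for litellm's ChatCompletionMessageToolCall objects (an assumption, so the snippet runs without litellm); the key point is that the arguments field arrives as a JSON string and must be decoded before dispatching to the tool:

```python
import json

# Stand-ins for litellm's Function / ChatCompletionMessageToolCall types
# (assumption: only .function.name and .function.arguments are accessed).
class Function:
    def __init__(self, name, arguments):
        self.name = name
        self.arguments = arguments

class ToolCall:
    def __init__(self, function):
        self.function = function

# The output shape printed above: text is None, tool_calls carries the call.
outputs = [{
    "text": None,
    "tool_calls": [ToolCall(Function("get_current_weather",
                                     '{"location":"Paris, France"}'))],
}]

# Decode the JSON-string arguments before handing them to the tool.
for call in outputs[0]["tool_calls"]:
    args = json.loads(call.function.arguments)
    print(call.function.name, args)  # get_current_weather {'location': 'Paris, France'}
```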

@@ -86,7 +86,13 @@ def __init__(
), "OpenAI's o1-* models require passing temperature=1.0 and max_tokens >= 5000 to `dspy.LM(...)`"

    @with_callbacks
-   def __call__(self, prompt=None, messages=None, **kwargs):
+   def __call__(self, prompt=None, messages=None, tools=None, **kwargs):
+       if tools is not None and not litellm.supports_function_calling(self.model):
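A hedged sketch of what this guard might do. Raising a ValueError and the exact message are assumptions, not the PR's actual code, and the capability check is passed in as a parameter so the sketch runs without litellm (in the real code it would be litellm.supports_function_calling):

```python
# Hypothetical completion of the guard above: reject `tools` for models
# that the capability check reports as lacking native function calling.
def validate_tools(model, tools, supports_function_calling):
    if tools is not None and not supports_function_calling(model):
        raise ValueError(
            f"Model {model!r} does not support native function calling; "
            "pass tools=None or choose a tool-calling-capable model."
        )

# Usage (assumed): validate_tools(self.model, tools, litellm.supports_function_calling)
```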
Collaborator
Thanks @chenmoneygithub ! Right now, we have a general solution that works well.

I'm not sure we should replace it with a special solution that doesn't work for many model providers.

Collaborator Author

Yes, that's a good point. Yuki from the MLflow team reached out asking if we can use the standard tool calling supported by litellm so that we can trace the tool calls. That makes sense: right now we mix the tool calls into message.content, and with that approach it's hard to trace them.

My plan is this:

  • for OAI/Anthropic/Databricks providers, which have standard function calling, we can use the standard approach, which is identical across these providers.
  • for other models, e.g., locally hosted models, we keep our current logic.

I see two downsides:

  1. When we change the logic in dspy.ReAct, there may be a performance change (drop or increase, I am not sure), so we need to be cautious.
  2. dspy.ReAct will have two branches: one for LLMs that support tool calling, the other for LLMs that don't. So the code will be slightly more complex: https://github.com/stanfordnlp/dspy/blob/main/dspy/predict/react.py#L85-L96.

And I see two benefits:

  1. We can enable clean tool-calling tracing, which is useful for debugging complex agents.
  2. I expect tool calling to be more robust with OAI/Anthropic if we use their protocol.
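The plan above could be sketched roughly as follows. Note that run_react_step and supports_native_tools are hypothetical names for the dispatch, not code from this PR:

```python
# Hypothetical sketch of the proposed two-branch dispatch in dspy.ReAct:
# native tool calling where the provider supports it, the current
# prompt-based logic everywhere else.
def run_react_step(lm, messages, tools, supports_native_tools):
    if supports_native_tools(lm):
        # OAI/Anthropic/Databricks path: provider-native tool calling,
        # which also makes tool calls traceable (e.g. by MLflow).
        return lm(messages=messages, tools=tools)
    # Fallback path (e.g. locally hosted models): current logic, with
    # the tools described inside the prompt itself.
    return lm(messages=messages)
```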

Please let me know your thoughts!

@chenmoneygithub chenmoneygithub requested a review from okhat January 9, 2025 02:43
@@ -110,6 +116,8 @@ def __call__(self, prompt=None, messages=None, **kwargs):
            }
            for c in response["choices"]
        ]
+       elif tools:
Collaborator

I think the if statements here need to be refactored. Right now, you can only ask for logprobs or tools, but not both.

I think we should move the if statement about logprobs and tools etc inside the loop over choices?
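The suggested refactor might look roughly like this. parse_choices and its output fields are a hypothetical reshaping of the parsing code, not the PR's actual implementation:

```python
# Hypothetical sketch of moving the logprobs/tools decisions inside the
# loop over choices, so a caller can request both at once instead of the
# top-level if/elif forcing a choice between them.
def parse_choices(response, logprobs=False, tools=None):
    outputs = []
    for c in response["choices"]:
        message = c["message"]
        entry = {"text": message.get("content")}
        if logprobs:
            entry["logprobs"] = c.get("logprobs")
        if tools and message.get("tool_calls"):
            entry["tool_calls"] = message["tool_calls"]
        outputs.append(entry)
    return outputs
```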
