diff --git a/.github/scripts/spellcheck_conf/wordlist.txt b/.github/scripts/spellcheck_conf/wordlist.txt index f7b6bc249..460b49a76 100644 --- a/.github/scripts/spellcheck_conf/wordlist.txt +++ b/.github/scripts/spellcheck_conf/wordlist.txt @@ -1451,3 +1451,6 @@ openhathi sarvam subtask acc +CrowdStrike +builtins +cybersecurity \ No newline at end of file diff --git a/recipes/3p_integrations/octoai/builtins/README.md b/recipes/3p_integrations/octoai/builtins/README.md new file mode 100644 index 000000000..0130778cf --- /dev/null +++ b/recipes/3p_integrations/octoai/builtins/README.md @@ -0,0 +1,53 @@ +# Using Llama3.1 built-in tools +Meta's latest Llama3.1 models offer unique function calling capabilities. In particular, they offer built-in tool calling for the following three external tools: +* Brave Search: internet search +* Code Interpreter: Python code interpreter +* Wolfram Alpha: mathematical and scientific knowledge tool + +To illustrate the benefits of the built-in tools, let's compare what an LLM returns with and without tool calling capabilities. In particular: + +### Code Interpreter +User Query: `I just got a 25 year mortgage of 400k at a fixed rate of 5.14% and paid 20% down. How much will I pay in interest?` +* Answer without tool calls (wrong): `Total paid interest: $184,471` +* Answer with tool calls (correct): `you will pay a total of $249,064.70 in interest` + +### Brave Search +User Query: `What caused a worldwide outage in the airlines industry in July of 2024?` +* Answer without tool calls: `I'm not aware of anything that would have caused a worldwide outage in the airlines industry.` +* Answer with tool calls: `The global technology outage was caused by a faulty software update that affected Windows programs running cybersecurity technology from CrowdStrike. The outage disrupted flights, media outlets, hospitals, small businesses, and government offices, highlighting the vulnerability of the world's interconnected systems.` +
### Wolfram Alpha +User Query: `Derive the prime factorization of 892041` +* Answer without tool calls (wrong): `The prime factorization of 892041 is:\n\n2 × 2 × 2 × 3 × 3 × 3 × 5 × 13 × 17 × 17` +* Answer with tool calls (correct): `The prime factorization of 892041 is 3 × 17 × 17491.` + +## What you will build +You will learn how to make use of these built-in capabilities to address some of the notorious weaknesses of LLMs: +* Limited ability to reason about complex mathematical notions +* Limited ability to answer questions about current events (or data that wasn't included in the model's training set) + +## What you will use +You'll learn to invoke Llama3.1 models hosted on OctoAI, and make use of the model's built-in tool calling capabilities via a standardized OpenAI-compatible chat completions API (a minimal sketch of such a call appears at the end of this README). + +## Instructions +Make sure you have Jupyter Notebook installed in your environment before launching the notebook in the `recipes/3p_integrations/octoai/builtins` directory. + +The rest of the instructions are described in the notebook itself.
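+
+## Example: invoking a built-in tool
+As a minimal sketch of what you'll do in the notebook, here is a built-in tool call through the OpenAI-compatible API. The endpoint, model name, and tool name below mirror the notebook; the code assumes `OCTOAI_API_KEY` is set in your environment, and the final print is illustrative (the exact tool call the model emits can vary):
+
+```python
+import os
+from openai import OpenAI
+
+client = OpenAI(base_url="https://text.octoai.run/v1", api_key=os.environ["OCTOAI_API_KEY"])
+
+completion = client.chat.completions.create(
+    model="meta-llama-3.1-405b-instruct",
+    messages=[{"role": "user", "content": "What caused a worldwide outage in the airlines industry in July of 2024?"}],
+    # Offer the built-in tool by name; the model typically responds with a brave_search tool call
+    tools=[{"type": "function", "function": {"name": "brave_search"}}],
+)
+print(completion.choices[0].message.tool_calls)
+```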
diff --git a/recipes/3p_integrations/octoai/builtins/llama31_tools.ipynb b/recipes/3p_integrations/octoai/builtins/llama31_tools.ipynb new file mode 100644 index 000000000..263e5e6d3 --- /dev/null +++ b/recipes/3p_integrations/octoai/builtins/llama31_tools.ipynb @@ -0,0 +1,824 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "207b3789-eb35-42a4-a6ba-23966ac56a12", + "metadata": {}, + "source": [ + "# Python Package Setup" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "46477225-22df-4fe5-9b6d-982fdf74634c", + "metadata": {}, + "outputs": [], + "source": [ + "! pip install -r requirements.txt" + ] + }, + { + "cell_type": "markdown", + "id": "ab52f04d-d241-439d-b3f0-8735ad566a00", + "metadata": {}, + "source": [ + "# OctoAI Setup\n", + "\n", + "OctoAI provides inference endpoints to the best and latest open source LLMs out there, including Meta's newly released Llama3.1 models:\n", + "* meta-llama-3.1-8b-instruct\n", + "* meta-llama-3.1-70b-instruct\n", + "* meta-llama-3.1-405b-instruct\n", + "\n", + "For the examples below, we'll use the 405b model variant.\n", + "\n", + "Create an account on [OctoAI](https://octoai.cloud/) via your Google account or GitHub account, and create an OctoAI token that we'll enter below.\n", + "\n", + "If you're creating an account for the first time, you'll be given $10 of free credit, which covers plenty of LLM usage to start your OctoAI journey. You can find out more about LLM pricing [here](https://octo.ai/docs/getting-started/pricing-and-billing#text-gen-solution)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ab78649c-e7d6-44df-8d9d-66c3234c49c7", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "os.environ[\"OCTOAI_API_KEY\"] = getpass.getpass()" + ] + }, + { + "cell_type": "markdown", + "id": "9c47df4f-7d2b-444e-a9ed-1096320837a1", + "metadata": {}, + "source": [ + "## OctoAI test\n", + "\n", + "Let's run the cell below to ensure we're all properly set up." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7f9630be-d30a-44d2-b3a4-e9433ef22d26", + "metadata": {}, + "outputs": [], + "source": [ + "from openai import OpenAI\n", + "import json\n", + "\n", + "client = OpenAI(\n", + " base_url=\"https://text.octoai.run/v1\",\n", + " api_key=os.environ[\"OCTOAI_API_KEY\"]\n", + ")\n", + "\n", + "model = \"meta-llama-3.1-405b-instruct\"\n", + "\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=[\n", + " {\"role\": \"user\", \"content\": \"Hello!\"},\n", + " ]\n", + ")\n", + "\n", + "print(completion.choices[0].message)" + ] + }, + { + "cell_type": "markdown", + "id": "c4a6b90d-ee4d-413d-b61f-915bc25016d5", + "metadata": {}, + "source": [ + "# Wolfram and Brave Search Setup\n", + "\n", + "In this notebook you'll need to use two external tools: Wolfram Alpha and Brave Search.
You can get set up by following the links below and obtaining the API keys to drive the examples we'll look into next.\n", + "\n", + "Generate a Wolfram App ID for the Simple API on its [developer portal](https://developer.wolframalpha.com/)\n", + "\n", + "Generate a Brave Search API key on the [Search API dashboard](https://api.search.brave.com/login)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d6e4856e-b07a-41f1-ab50-6197c1b76c2b", + "metadata": {}, + "outputs": [], + "source": [ + "os.environ[\"BRAVE_API_KEY\"] = getpass.getpass()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7e0e9511-5425-45d3-ada2-52630ce77bf3", + "metadata": {}, + "outputs": [], + "source": [ + "os.environ[\"WOLFRAM_APP_ID\"] = getpass.getpass()" + ] + }, + { + "cell_type": "markdown", + "id": "61534d43-dbc7-4d98-9b27-5e98460b5ea3", + "metadata": {}, + "source": [ + "# LLM Limitations\n", + "\n", + "LLMs have several widely known limitations:\n", + "* They are very limited at resolving mathematical questions on their own (e.g. \"I just got a 25 year mortgage of 400k at a fixed rate of 5.14% and paid 20% down. How much will I pay in interest?\")\n", + "* The parametric memory of an LLM is limited to the data it sees at training time. When asked about recent events, LLMs on their own won't be able to answer the question or, even worse, will end up hallucinating an answer (e.g. \"What caused severe airlines industry disruptions in July of 2024?\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f850a2b8-dbd9-4e35-90ab-c20b0db473c9", + "metadata": {}, + "outputs": [], + "source": [ + "# Example of limitation #1\n", + "# The model hallucinates the result, which is especially bad because it answers so confidently.\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=[\n", + " {\"role\": \"user\", \"content\": \"I just got a 25 year mortgage of 400k at a fixed rate of 5.14% and paid 20% down. How much will I pay in interest?\"}\n", + " ]\n", + ")\n", + "\n", + "print(completion.choices[0].message)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b48d9ac6-ae47-4c33-97da-1dd0a8470e43", + "metadata": {}, + "outputs": [], + "source": [ + "# Example of limitation #2\n", + "# The model is unable to answer a question on an event that post-dates its training data.\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=[\n", + " {\"role\": \"user\", \"content\": \"What caused severe airlines industry disruptions in July of 2024\"}\n", + " ]\n", + ")\n", + "\n", + "print(completion.choices[0].message)" + ] + }, + { + "cell_type": "markdown", + "id": "6d2ea225-d545-4937-8441-655a2fa7c2e2", + "metadata": {}, + "source": [ + "## Tools to the Rescue!\n", + "\n", + "Thankfully, Llama3.1 has built-in tool calling capabilities for 3 very handy external tools:\n", + "* A Python interpreter, which runs arbitrary Python code to answer questions that LLMs cannot answer on their own (e.g. questions that require advanced mathematical reasoning)\n", + "* Web search, which greatly expands the LLM's ability to access information on recent and live events\n", + "* Wolfram Alpha, which lets the LLM run queries across many topics including Mathematics, Science & Technology, Society & Culture, and Everyday Life\n", + "\n", + "Together, these three built-in tools vastly expand the LLM's ability to provide helpful answers, as we'll see in the examples below.",
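+    "\n",
+    "As a point of reference before we wire up the tools, the correct answer to the mortgage question can be computed directly with the standard fixed-rate amortization formula. The sketch below is not part of the notebook's tool flow; it assumes the 400k is the home price (so 320k is financed after the 20% down payment) and monthly compounding:\n",
+    "\n",
+    "```python\n",
+    "principal = 400_000 * 0.80  # 20% down payment -> 320k financed\n",
+    "r = 0.0514 / 12             # monthly interest rate\n",
+    "n = 25 * 12                 # number of monthly payments\n",
+    "payment = principal * r / (1 - (1 + r) ** -n)\n",
+    "print(f\"Total interest: ${payment * n - principal:,.2f}\")  # ~$249,064.70\n",
+    "```"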
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bc80b31b-e8b9-4472-8038-1ba5e8d1d12f", + "metadata": {}, + "outputs": [], + "source": [ + "# Required for the Python interpreter\n", + "import sys\n", + "from io import StringIO\n", + "import contextlib\n", + "# Required for Brave Search\n", + "from brave import Brave\n", + "# Required for Wolfram Alpha\n", + "from wolframalpha import Client\n", + "import nest_asyncio\n", + "\n", + "# Need to run this for the Wolfram client lib\n", + "nest_asyncio.apply()\n", + "\n", + "# Helper for the code interpreter: temporarily swap sys.stdout so we can capture what exec() prints\n", + "@contextlib.contextmanager\n", + "def stdoutIO(stdout=None):\n", + " old = sys.stdout\n", + " if stdout is None:\n", + " stdout = StringIO()\n", + " sys.stdout = stdout\n", + " yield stdout\n", + " sys.stdout = old\n", + "\n", + "# Code interpreter definition\n", + "def code_interpreter(code: str) -> str:\n", + " with stdoutIO() as s:\n", + " exec(code)\n", + " return s.getvalue()\n", + "\n", + "# Brave Search definition\n", + "def brave_search(query: str) -> str:\n", + " brave = Brave(os.environ[\"BRAVE_API_KEY\"])\n", + " results = brave.search(q=query, count=10)\n", + " return str(results)\n", + "\n", + "# Wolfram Alpha definition\n", + "def wolfram_alpha(query: str) -> str:\n", + " client = Client(os.environ[\"WOLFRAM_APP_ID\"])\n", + " return str(client.query(query))\n", + "\n", + "# Map tool names to functions so we can dispatch the model's tool calls\n", + "names_to_functions = {\n", + " \"code_interpreter\": code_interpreter,\n", + " \"brave_search\": brave_search,\n", + " \"wolfram_alpha\": wolfram_alpha\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "aa9983b1-3662-45ac-b68f-9878d8307457", + "metadata": {}, + "source": [ + "# Brave Search\n", + "\n", + "Brave Search is excellent for providing context to the LLM on recent events, or real-time information (e.g. weather).",
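+    "\n",
+    "Before handing the tool to the model, you can sanity-check the helper we just defined by calling it directly (the query is illustrative, and the raw result string is long, so we only print a slice):\n",
+    "\n",
+    "```python\n",
+    "print(brave_search(\"airlines outage July 2024\")[:300])\n",
+    "```"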
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d7bb27ca-4f31-47a0-8d96-5632143a7392", + "metadata": {}, + "outputs": [], + "source": [ + "messages=[\n", + " {\n", + " \"role\": \"system\",\n", + " \"content\": \"You are a helpful assistant.\"\n", + " },\n", + " {\n", + " \"role\": \"user\",\n", + " \"content\": \"What caused severe airlines industry disruptions in July of 2024?\"\n", + " }\n", + "]\n", + "\n", + "# First LLM inference\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"brave_search\"}}]\n", + ")\n", + "\n", + "# Append the assistant response to messages\n", + "assistant_response = completion.choices[0].message\n", + "messages.append(\n", + " {\n", + " \"role\": \"assistant\",\n", + " \"content\": \"\",\n", + " \"tool_calls\": completion.choices[0].message.tool_calls\n", + " }\n", + ")\n", + "\n", + "print(completion.choices[0].message)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6c8459e2-3e09-4017-95e0-deddf38712b8", + "metadata": {}, + "outputs": [], + "source": [ + "# Derive the tool call information\n", + "tool_call = completion.choices[0].message.tool_calls[0]\n", + "function_name = tool_call.function.name\n", + "function_params = json.loads(tool_call.function.arguments)\n", + "\n", + "# Compute the results\n", + "function_result = names_to_functions[function_name](**function_params)\n", + "\n", + "# Append the tool response to the messages\n", + "messages.append(\n", + " {\n", + " \"role\": \"tool\",\n", + " \"content\": function_result\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b519da53-31f4-485d-9685-4bceae3a3e2e", + "metadata": {}, + "outputs": [], + "source": [ + "# Formulate the final LLM answer based on the provided context\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"brave_search\"}}]\n", + ")\n", + "\n", + "print(completion.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "id": "ebcfe7e3-5697-4c6d-9700-535570bd430a", + "metadata": {}, + "source": [ + "# Python Interpreter\n", + "\n", + "The Python interpreter is great for solving complex mathematical or coding challenges." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "736d22e5-2f09-41fb-9696-09706fbc3794", + "metadata": {}, + "outputs": [], + "source": [ + "messages=[\n", + " {\n", + " \"role\": \"system\",\n", + " \"content\": \"You are a personal math tutor. When asked a math question, write and run code to answer the question.\"\n", + " },\n", + " {\n", + " \"role\": \"user\",\n", + " \"content\": \"I just got a 25 year mortgage of 400k at a fixed rate of 5.14% and paid 20% down.
How much will I pay in interest?\"\n", + " }\n", + "]\n", + "\n", + "# First LLM inference\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"code_interpreter\"}}]\n", + ")\n", + "\n", + "# Append the assistant response to messages\n", + "assistant_response = completion.choices[0].message\n", + "messages.append(\n", + " {\n", + " \"role\": \"assistant\",\n", + " \"content\": \"\",\n", + " \"tool_calls\": completion.choices[0].message.tool_calls\n", + " }\n", + ")\n", + "\n", + "print(json.loads(completion.choices[0].message.tool_calls[0].function.arguments)[\"code\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d5729d4f-96e8-4a26-b8e5-25b59ab4a0bc", + "metadata": {}, + "outputs": [], + "source": [ + "# Derive the tool call information\n", + "tool_call = completion.choices[0].message.tool_calls[0]\n", + "function_name = tool_call.function.name\n", + "function_params = json.loads(tool_call.function.arguments)\n", + "\n", + "# Compute the results\n", + "function_result = names_to_functions[function_name](**function_params)\n", + "print(function_result)\n", + "\n", + "# Append the tool response to the messages\n", + "messages.append(\n", + " {\n", + " \"role\": \"tool\",\n", + " \"content\": function_result\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "327f8b3c-df89-46f8-8622-8d83c259027a", + "metadata": {}, + "outputs": [], + "source": [ + "# Formulate the final LLM answer based on the provided context\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"code_interpreter\"}}]\n", + ")\n", + "\n", + "print(completion.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "id": "bfe6115b-cb24-4c87-8428-6d87cffc67eb", + "metadata": {}, + "source": [ + "# Wolfram Alpha\n", + "\n", + "Wolfram Alpha is a great tool for answering even more challenging mathematical or scientific questions that the Python interpreter would perform poorly on." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fb9196a3-5d76-4717-b934-857f75c938ad", + "metadata": {}, + "outputs": [], + "source": [ + "messages=[\n", + " {\n", + " \"role\": \"system\",\n", + " \"content\": \"You are a helpful assistant.
Use the wolfram_alpha tool for all user queries.\"\n", + " },\n", + " {\n", + " \"role\": \"user\",\n", + " \"content\": \"What are the poles of (z^2-4) / ((z-2)^4*(z^2+5z+7))\"\n", + " }\n", + "]\n", + "\n", + "# First LLM inference\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"wolfram_alpha\"}}]\n", + ")\n", + "\n", + "# Append the assistant response to messages\n", + "assistant_response = completion.choices[0].message\n", + "messages.append(\n", + " {\n", + " \"role\": \"assistant\",\n", + " \"content\": \"\",\n", + " \"tool_calls\": completion.choices[0].message.tool_calls\n", + " }\n", + ")\n", + "\n", + "print(completion.choices[0].message)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a465045c-4502-4ecd-8c9d-4edbdb8d23e8", + "metadata": {}, + "outputs": [], + "source": [ + "# Derive the tool call information\n", + "tool_call = completion.choices[0].message.tool_calls[0]\n", + "function_name = tool_call.function.name\n", + "function_params = json.loads(tool_call.function.arguments)\n", + "\n", + "# Compute the results\n", + "function_result = names_to_functions[function_name](**function_params)\n", + "\n", + "# Append the tool response to the messages\n", + "messages.append(\n", + " {\n", + " \"role\": \"tool\",\n", + " \"content\": function_result\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a388a4f7-2afa-4d67-ad3b-8a45be3a3517", + "metadata": {}, + "outputs": [], + "source": [ + "# Formulate the final LLM answer based on the provided context\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"wolfram_alpha\"}}]\n", + ")\n", + "\n", + "print(completion.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "id": "e9cb3be2-fc5b-41ad-8431-c3cfec8bfd5e", + "metadata": {}, + "source": [ + "# Generic function calling\n", + "\n", + "Beyond the built-ins, Llama3.1 also excels at generic function calling. See the example below for how to use Llama3.1 to invoke user-defined external tools.",
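+    "\n",
+    "One caveat, which the cell below leaves out for brevity: with generic tools the model may choose to answer directly, in which case `tool_calls` is `None` and the message-append step would fail. A defensive sketch:\n",
+    "\n",
+    "```python\n",
+    "agent_response = response.choices[0].message\n",
+    "if agent_response.tool_calls:\n",
+    "    pass  # run the tool-call loop shown in the cell below\n",
+    "else:\n",
+    "    print(agent_response.content)  # the model answered directly\n",
+    "```"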
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8a9a6a53-3668-44fa-8b41-6870baf4c50f", + "metadata": {}, + "outputs": [], + "source": [ + "# Mock function to simulate getting flight status\n", + "def get_flight_status(flight_number, date):\n", + " return json.dumps({\"flight_number\": flight_number, \"status\": \"On Time\", \"date\": date})\n", + "\n", + "# Define the function and its parameters to be available for the model\n", + "tools = [\n", + " {\n", + " \"type\": \"function\",\n", + " \"function\": {\n", + " \"name\": \"get_flight_status\",\n", + " \"description\": \"Get the current status of a flight\",\n", + " \"parameters\": {\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"flight_number\": {\n", + " \"type\": \"string\",\n", + " \"description\": \"The flight number, e.g., AA100\"\n", + " },\n", + " \"date\": {\n", + " \"type\": \"string\",\n", + " \"format\": \"date\",\n", + " \"description\": \"The date of the flight, e.g., 2024-06-17\"\n", + " }\n", + " },\n", + " \"required\": [\"flight_number\", \"date\"]\n", + " }\n", + " }\n", + " }\n", + "]\n", + "\n", + "# Initial conversation setup with the system and user roles\n", + "messages = [\n", + " {\"role\": \"system\", \"content\": \"You are a helpful assistant that can help with flight information and status.\"},\n", + " {\"role\": \"user\", \"content\": \"I have a flight booked for tomorrow with American Airlines, flight number AA100. Can you check its status for me?\"}\n", + "]\n", + "\n", + "# Create a chat completion request with the model, messages, and the tools available to the model\n", + "response = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " tools=tools,\n", + " tool_choice=\"auto\",\n", + " temperature=0\n", + ")\n", + "\n", + "# Extract the agent's response from the API response\n", + "agent_response = response.choices[0].message\n", + "\n", + "# Append the response from the model to keep state in the conversation\n", + "messages.append(\n", + " {\n", + " \"role\": agent_response.role,\n", + " \"content\": \"\",\n", + " \"tool_calls\": [\n", + " tool_call.model_dump()\n", + " for tool_call in response.choices[0].message.tool_calls\n", + " ]\n", + " }\n", + ")\n", + "\n", + "# Process any tool calls made by the model\n", + "tool_calls = response.choices[0].message.tool_calls\n", + "if tool_calls:\n", + " for tool_call in tool_calls:\n", + " function_name = tool_call.function.name\n", + " function_args = json.loads(tool_call.function.arguments)\n", + "\n", + " # Call the function to get the response\n", + " function_response = locals()[function_name](**function_args)\n", + "\n", + " # Add the function response to the messages block\n", + " messages.append(\n", + " {\n", + " \"tool_call_id\": tool_call.id,\n", + " \"role\": \"tool\",\n", + " \"name\": function_name,\n", + " \"content\": function_response,\n", + " }\n", + " )\n", + "\n", + " # Pass the updated messages to the model to get the final enriched response\n", + " function_enriched_response = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " tools=tools,\n", + " tool_choice=\"auto\",\n", + " )\n", + " print(function_enriched_response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "id": "6acaeda0-4d78-4fb5-b387-c31b200435a5", + "metadata": {}, + "source": [ + "# Bonus: PhotoGen\n", + "\n", + "Let's see how we can use function calling to have an LLM generate an image with OctoAI."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3dfd0c98-7def-438e-b68f-de7f85c0b6cc", + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "import base64\n", + "from PIL import Image\n", + "\n", + "# Let's define the photogen function that uses the OctoAI image generation service to generate an image\n", + "def photogen(query: str) -> str:\n", + " url = \"https://image.octoai.run/generate/sdxl\"\n", + "\n", + " headers = {\n", + " \"Authorization\": f\"Bearer {os.environ['OCTOAI_API_KEY']}\",\n", + " \"Content-Type\": \"application/json\",\n", + " }\n", + "\n", + " # Define the data payload\n", + " data = {\n", + " \"prompt\": f\"A beautiful image of {query}\",\n", + " \"checkpoint\": \"octoai:lightning_sdxl\",\n", + " \"width\": 1024,\n", + " \"height\": 1024,\n", + " \"num_images\": 1,\n", + " \"sampler\": \"DPM_PLUS_PLUS_SDE_KARRAS\",\n", + " \"steps\": 8,\n", + " \"cfg_scale\": 3,\n", + " \"use_refiner\": False,\n", + " \"style_preset\": \"base\",\n", + " }\n", + "\n", + " # Send the POST request\n", + " response = requests.post(url, headers=headers, data=json.dumps(data))\n", + "\n", + " # Parse the JSON response\n", + " response_json = response.json()\n", + " image_b64 = response_json[\"images\"][0][\"image_b64\"]\n", + "\n", + " # Decode the base64 image and save it to a file\n", + " image_data = base64.b64decode(image_b64)\n", + "\n", + " # Write the image to disk\n", + " filename = \"result.jpg\"\n", + " with open(filename, \"wb\") as f:\n", + " f.write(image_data)\n", + "\n", + " return filename\n", + "\n", + "# Update our function dictionary\n", + "names_to_functions[\"photogen\"] = photogen" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9aafb279-efc0-4b2f-b977-ea2b9b443036", + "metadata": {}, + "outputs": [], + "source": [ + "messages=[\n", + " {\n", + " \"role\": \"system\",\n", + " \"content\": \"You are a personal assistant with image generation capabilities.\"\n", + " },\n", + " {\n", + " \"role\": \"user\",\n", + " \"content\": \"Generate a photo of a pink gift box on top of a table in a Swiss cafe.\"\n", + " }\n", + "]\n", + "\n", + "# First LLM inference\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"photogen\"}}]\n", + ")\n", + "\n", + "# Append the assistant response to messages\n", + "assistant_response = completion.choices[0].message\n", + "messages.append(\n", + " {\n", + " \"role\": \"assistant\",\n", + " \"content\": \"\",\n", + " \"tool_calls\": completion.choices[0].message.tool_calls\n", + " }\n", + ")\n", + "\n", + "print(json.loads(completion.choices[0].message.tool_calls[0].function.arguments)[\"query\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3f9b42b6-55f6-4835-b866-9c3852c963c0", + "metadata": {}, + "outputs": [], + "source": [ + "# Derive the tool call information\n", + "tool_call = completion.choices[0].message.tool_calls[0]\n", + "function_name = tool_call.function.name\n", + "function_params = json.loads(tool_call.function.arguments)\n", + "\n", + "# Compute the results\n", + "function_result = names_to_functions[function_name](**function_params)\n", + "print(function_result)\n", + "\n", + "# Append the tool response to the messages\n", + "messages.append(\n", + " {\n", + " \"role\": \"tool\",\n", + " \"content\": function_result\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id":
"afc2bbe5-651d-4ba1-8498-a773bb6be90f", + "metadata": {}, + "outputs": [], + "source": [ + "# Formulate the final LLM answer based on the provided context\n", + "completion = client.chat.completions.create(\n", + " model=model,\n", + " messages=messages,\n", + " temperature=0,\n", + " tools=[{\"type\": \"function\", \"function\": {\"name\": \"photogen\"}}]\n", + ")\n", + "\n", + "print(completion.choices[0].message.content)\n", + "\n", + "# Display image stored in the file path given by function_result\n", + "from PIL import Image\n", + "Image.open(function_result)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/recipes/use_cases/tool_calling/builtins/requirements.txt b/recipes/use_cases/tool_calling/builtins/requirements.txt new file mode 100644 index 000000000..637049f4b --- /dev/null +++ b/recipes/use_cases/tool_calling/builtins/requirements.txt @@ -0,0 +1,5 @@ +pillow +jupyter +openai +brave-search +wolframalpha \ No newline at end of file