Add JSON structured outputs to Ollama Provider #680

aidando73 · 2024-12-22T04:51:03Z

What does this PR do?

Addresses issue #679

Adds support for the response_format field in chat completions and completions, so users can get their outputs as JSON that follows a supplied schema.
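
As a rough illustration of the idea (not the code in this PR), the sketch below maps a Llama Stack style response_format dict onto the `format` field of an Ollama chat request body. Per the Ollama API docs linked under Sources, `format` accepts a JSON schema object. The helper name `build_ollama_payload` and the exact dict shapes are assumptions made for this example.

```python
import json
from typing import Any, Optional


def build_ollama_payload(
    model: str,
    messages: list[dict[str, str]],
    response_format: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build a request body for Ollama's chat endpoint."""
    payload: dict[str, Any] = {"model": model, "messages": messages, "stream": False}
    if response_format is None:
        # No structured output requested: leave `format` unset.
        return payload
    if response_format.get("type") == "json_schema":
        # Pass the JSON schema through as Ollama's `format` field.
        payload["format"] = response_format["json_schema"]
        return payload
    raise ValueError(f"Unsupported response_format type: {response_format.get('type')!r}")


schema = {
    "type": "object",
    "properties": {"steps": {"type": "array", "items": {"type": "string"}}},
    "required": ["steps"],
}
payload = build_ollama_payload(
    "llama3.2:3b-instruct-fp16",
    [{"role": "user", "content": "Plan the task."}],
    {"type": "json_schema", "json_schema": schema},
)
print(json.dumps(payload["format"], indent=2))
```

Rejecting unknown response_format types with an error, rather than silently ignoring them, is a design choice of this sketch only.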

Test Plan

Integration tests

pytest llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_structured_output -k ollama -s -v

llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_structured_output[llama_8b-ollama] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_structured_output[llama_3b-ollama] PASSED

================================== 2 passed, 18 deselected, 3 warnings in 41.41s ==================================

Manual Tests

export INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct
export OLLAMA_INFERENCE_MODEL=llama3.2:3b-instruct-fp16
export LLAMA_STACK_PORT=5000

ollama run $OLLAMA_INFERENCE_MODEL --keepalive 60m
llama stack build --template ollama --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env INFERENCE_MODEL=$INFERENCE_MODEL \
  --env OLLAMA_URL=http://localhost:11434

    import json
    import os

    from llama_stack_client import LlamaStackClient

    client = LlamaStackClient(base_url=f"http://localhost:{os.environ['LLAMA_STACK_PORT']}")

    MODEL_ID = "meta-llama/Llama-3.2-3B-Instruct"
    prompt = """
        Create a step by step plan to complete the task of creating a codebase that is a web server that has an API endpoint that translates text from English to French.
        You have 3 different operations you can perform. You can create a file, update a file, or delete a file.
        Limit your step by step plan to only these operations per step.
        Don't create more than 10 steps.

        Please ensure there's a README.md file in the root of the codebase that describes the codebase and how to run it.
        Please ensure there's a requirements.txt file in the root of the codebase that describes the dependencies of the codebase.
        """
    response = client.inference.chat_completion(
        model_id=MODEL_ID,
        messages=[
            {"role": "user", "content": prompt},
        ],
        sampling_params={
            "max_tokens": 200000,
        },
        response_format={
            "type": "json_schema",
            "json_schema": {
                "$schema": "http://json-schema.org/draft-07/schema#",
                "title": "Plan",
                "description": "A plan to complete the task of creating a codebase that is a web server that has an API endpoint that translates text from English to French.",
                "type": "object",
                "properties": {
                    "steps": {
                        "type": "array",
                        "items": {
                            "type": "string"
                        }
                    }
                },
                "required": ["steps"],
                "additionalProperties": False,
            }
        },
        stream=True,
    )

    content = ""
    for chunk in response:
        if chunk.event.delta:
            print(chunk.event.delta, end="", flush=True)
            content += chunk.event.delta

    try:
        plan = json.loads(content)
        print(plan)
    except Exception as e:
        print(f"Error parsing plan into JSON: {e}")
        plan = {"steps": []}

Outputs:

{
    "steps": [
        "Update the requirements.txt file to include the updated dependencies specified in the peer's feedback, including the Google Cloud Translation API key.",
        "Update the app.py file to address the code smells and incorporate the suggested improvements, such as handling errors and exceptions, initializing the Translator object correctly, adding input validation, using type hints and docstrings, and removing unnecessary logging statements.",
        "Create a README.md file that describes the codebase and how to run it.",
        "Ensure the README.md file is up-to-date and accurate.",
        "Update the requirements.txt file to reflect any additional dependencies specified by the peer's feedback.",
        "Add documentation for each function in the app.py file using docstrings.",
        "Implement logging statements throughout the app.py file to monitor application execution.",
        "Test the API endpoint to ensure it correctly translates text from English to French and handles errors properly.",
        "Refactor the code to follow PEP 8 style guidelines and ensure consistency in naming conventions, indentation, and spacing.",
        "Create a new folder for logs and add a logging configuration file (e.g., logconfig.json) that specifies the logging level and output destination.",
        "Deploy the web server on a production environment (e.g., AWS Elastic Beanstalk or Google Cloud Platform) to make it accessible to external users."
    ]
}
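
A quick way to sanity-check output like the above is to verify its shape in code. The sketch below is not part of this PR; `check_plan` and its `max_steps` parameter are hypothetical names used for illustration. It checks that the text parses as JSON, has only a `steps` key, and that `steps` is a list of strings. It also applies the ten-step limit from the prompt, which the schema in the request does not encode (the output above contains 11 steps).

```python
import json
from typing import Any


def check_plan(raw: str, max_steps: int = 10) -> list[str]:
    """Hypothetical check: return a list of problems found in a plan JSON string."""
    try:
        plan: Any = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(plan, dict):
        return ["top-level value is not an object"]

    problems: list[str] = []
    # The schema sets additionalProperties to False, so only "steps" is allowed.
    extra = sorted(set(plan) - {"steps"})
    if extra:
        problems.append(f"unexpected keys: {extra}")

    steps = plan.get("steps")
    if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
        problems.append("'steps' must be a list of strings")
    elif len(steps) > max_steps:
        # This limit comes from the prompt text, not from the JSON schema.
        problems.append(f"{len(steps)} steps, expected at most {max_steps}")
    return problems


sample = json.dumps({"steps": [f"step {i}" for i in range(11)]})
print(check_plan(sample))
```

An empty list means no problems were found. The same function could be called on `content` from the streaming loop in place of the bare `json.loads`.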

Sources

Ollama api docs: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion
Ollama structured output docs: https://github.com/ollama/ollama/blob/main/docs/api.md#request-structured-outputs

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Ran pre-commit to handle lint / formatting issues.
Read the contributor guideline, Pull Request section?
Updated relevant documentation.
Wrote necessary unit or integration tests.

Add JSON structured outputs to Ollama

da82fb2

aidando73 requested review from ashwinb, yanxi0830, hardikjshah, dltn, raghotham, dineshyv and vladimirivic as code owners December 22, 2024 04:51

facebook-github-bot added the CLA Signed label Dec 22, 2024

aidando73 mentioned this pull request Dec 22, 2024

JSON structured outputs for Ollama #679

Open

uncomment

0ffcbb8

aidando73 changed the title ~~Add JSON structured outputs to Ollama~~ Add JSON structured outputs to Ollama Provider Dec 22, 2024

aidando73 mentioned this pull request Dec 22, 2024

Llama code review looping meta-llama/llama-recipes#825

Open

5 tasks
