Posts using-llms-in-production
davidhariri committed Apr 9, 2024
1 parent c2aa865 commit addb73b
Showing 2 changed files with 101 additions and 0 deletions.
96 changes: 96 additions & 0 deletions posts/using-llms-in-production.md
@@ -0,0 +1,96 @@
---
title: Using LLMs in Production
description: A nod to Will Larson's post on using LLMs in production and some additional notes based on my own experience.
date: 2024-04-09 11:45:00-0400
tags:
- LLMs
- ML
- code
---

[Will Larson](https://lethain.com) just wrote about his mental models for [using LLMs in production](https://lethain.com/mental-model-for-how-to-use-llms-in-products/). I agree with much of it, particularly the re-framing of what LLMs can _really do today_ for product developers.

## On the unsupervised (no human in the loop) scenario

> Because you cannot rely on LLMs to provide correct responses, and you cannot generate a confidence score for any given response, you have to either accept potential inaccuracies (which makes sense in many cases, humans are wrong sometimes too) or keep a Human-in-the-Loop (HITL) to validate the response.

I only wish the post touched more on the unsupervised (no human in the loop) scenario. For many workflows, an LLM with a human in the loop only marginally improves the workflow. To make systems that are autonomous, it's not just about accepting potential inaccuracies; it's also about accepting responsibility for _driving them down_. This is the super hard part of unsupervised LLM applications. You first have to educate customers on the trade-offs and risks they are taking on, and then you have to build systems that drive those risks toward zero and optimize those trade-offs for value, so that customers become increasingly confident in the system.

## Using schemas in prompts

A tactic that wasn't mentioned in the post is using [`JSONSchema`](https://json-schema.org/) within LLM prompts. This is a great way to ensure generations are more accurate and meet your system's expectations.

<blockquote class="callout note">
You don't need to use JSONSchema if you don't want to. We have had good results from simply showing a few examples of desired output in the prompt and letting the LLM infer the schema from that.
</blockquote>
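
For instance, here's a minimal sketch of that few-shot style (the example documents and IDs below are purely illustrative):

```py
# A few-shot prompt that shows the desired output shape instead of a formal schema
prompt = """Review each document and respond with JSON shaped like the examples.

Document 101: "The quick brown fox jumps over the lazy dog."
Review: {"document_id": 101, "review": "good"}

Document 102: "Teh quick brwon fox jumsp over the lazy dog."
Review: {"document_id": 102, "review": "bad"}

Document 103: "This is an exampel sentence errors in grammer and speling."
Review:"""
```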

Here's a toy example of how you can use `JSONSchema` with an LLM prompt:

```py
import json
import os
from typing import Literal

import openai
from pydantic import BaseModel, ValidationError

# Example docs
docs = [
    {
        'id': 1,
        'content': 'This is a well formed sentence that has no errors in grammar or spelling.'
    },
    {
        'id': 2,
        'content': 'This is an exampel sentence errors in grammer and speling.'
    }
]

# Define the schema you want the LLM to follow. We're using pydantic, but there are many options for this.
class DocumentReview(BaseModel):
    """A review of a single document."""
    document_id: int
    review: Literal['good', 'bad']

# Make a prompt that embeds the document and the JSONSchema
doc = docs[0]
prompt = f"""Given the following <document>, please review the document and provide your review using the provided JSONSchema:
<document id="{doc['id']}">
{doc['content']}
</document>
JSONSchema:
{json.dumps(DocumentReview.model_json_schema())}
Your review:
"""

# Assuming the OpenAI API key is set in environment variables
openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
    model="gpt-3.5-turbo-instruct",  # any completions-style model works here
    prompt=prompt,
    max_tokens=200,
)

# Assuming the LLM returns a JSON string that fits our schema
try:
    review = DocumentReview.model_validate_json(response.choices[0].text.strip())
except ValidationError as e:
    print(f"Error validating schema: {e}")
    raise

# The parsed review now has the types we asked for, including the document ID
print(f"Document ID: {review.document_id}, Review: {review.review}")
```

**Handling `ValidationError`**

1. You can handle the `ValidationError` by re-prompting the LLM with the validation error included and retrying (sketched below).
2. You can also handle the `ValidationError` by falling back to a HITL, using a queue of documents for a human to review.
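
Here's a rough sketch of the first option, reusing `prompt`, `DocumentReview`, `ValidationError`, and the OpenAI call from the toy example above (the retry limit and the wording of the follow-up prompt are assumptions, not a prescription):

```py
def review_with_retry(prompt: str, max_attempts: int = 3) -> DocumentReview:
    """Re-prompt the LLM with the validation error until the output parses."""
    current_prompt = prompt
    for _ in range(max_attempts):
        response = openai.Completion.create(
            model="gpt-3.5-turbo-instruct",
            prompt=current_prompt,
            max_tokens=200,
        )
        text = response.choices[0].text.strip()
        try:
            return DocumentReview.model_validate_json(text)
        except ValidationError as e:
            # Feed the validation error back so the model can correct its output
            current_prompt = (
                f"{prompt}\n"
                f"Your previous answer was:\n{text}\n\n"
                f"It failed validation with:\n{e}\n\n"
                "Please answer again with JSON that matches the schema exactly:\n"
            )
    raise ValueError("LLM output never matched the schema")
```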

---

Using schemas to validate generations lets you ensure the data generated by an LLM at least matches your data types. In addition, if the LLM is referencing passed-in material (such as in a RAG architecture), you can check that the document IDs referenced in generations match the source documents you provided. To improve this further, you can perform semantic or string-distance checks between the source documents' content and the generated output.
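
As a rough sketch of both checks, reusing `docs` and `review` from the toy example above (`generated_snippet` and the 0.8 threshold are illustrative assumptions):

```py
from difflib import SequenceMatcher

# Check that the generation references a document we actually passed in
source_ids = {doc['id'] for doc in docs}
if review.document_id not in source_ids:
    print("Generation references a document that was never provided")

def similarity(a: str, b: str) -> float:
    """A crude string-distance score between generated text and source content."""
    return SequenceMatcher(None, a, b).ratio()

# Hypothetical: compare a snippet quoted by the LLM against the source document
generated_snippet = "This is a well formed sentence with no errors."
if similarity(generated_snippet, docs[0]['content']) < 0.8:  # threshold is arbitrary
    print("Generation may not be grounded in the source document")
```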

For more on using `JSONSchema` with LLM prompts, see [this post](https://thoughtbot.com/blog/get-consistent-data-from-your-llm-with-json-schema) from ThoughtBot.
5 changes: 5 additions & 0 deletions static/styles.css
@@ -89,6 +89,10 @@ time {
color: var(--description-color);
}

hr {
margin: 2rem 0;
}

footer {
display: block;
align-self: bottom;
@@ -125,6 +129,7 @@ blockquote {
padding: 0 1rem;
color: var(--description-color);
border-left: 0.25rem solid var(--description-color);
margin-bottom: 1rem;
}

blockquote a,
