# Litellm dev 2024 12 19 p3 (BerriAI#7322)
- fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params. Fixes BerriAI#7242
- test: new test for langfuse prompt management hook. Addresses BerriAI#3893 (comment)
- feat(main.py): add 'get_chat_completion_prompt' custom logger hook, allowing for langfuse prompt management. Addresses BerriAI#3893 (comment)
- feat(langfuse_prompt_management.py): working e2e langfuse prompt management. Works with the `langfuse/` route
- feat(main.py): initial tracing for dynamic langfuse params. Allows admin to specify langfuse keys by model in model_list
- feat(main.py): support passing langfuse credentials dynamically
- fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params, so dynamic langfuse params work
- fix: fix linting errors
- docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial
- docs(prompt_management.md): cleanup doc
- docs: cleanup topnav
- docs(prompt_management.md): update docs to be easier to use
- fix: remove unused imports
- docs(prompt_management.md): add architectural overview doc
- fix(litellm_logging.py): fix dynamic param passing
- fix(langfuse_prompt_management.py): fix linting errors
- fix: fix linting errors
- fix: use typing_extensions for TypeAlias to ensure Python 3.8 compatibility
- test: use stream_options in test to account for tiktoken diff
- fix: improve import error message, and check run test earlier
1 parent 2c36f25 · commit 27a4d08 · 17 changed files with 631 additions and 243 deletions.
Below is `prompt_management.md` after this commit:
import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Prompt Management

Run experiments or change the specific model (e.g. from gpt-4o to a gpt-4o-mini finetune) from your prompt management tool (e.g. Langfuse) instead of making changes in the application.

Supported Integrations:
- [Langfuse](https://langfuse.com/docs/prompts/get-started)

## Quick Start

<Tabs>

<TabItem value="sdk" label="SDK">

```python
import os
import litellm

os.environ["LANGFUSE_PUBLIC_KEY"] = "public_key" # [OPTIONAL] set here or in `.completion`
os.environ["LANGFUSE_SECRET_KEY"] = "secret_key" # [OPTIONAL] set here or in `.completion`

litellm.set_verbose = True # see raw request to provider

resp = litellm.completion(
    model="langfuse/gpt-3.5-turbo",
    prompt_id="test-chat-prompt",
    prompt_variables={"user_message": "this is used"}, # [OPTIONAL]
    messages=[{"role": "user", "content": "<IGNORED>"}],
)
```
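
With this call, LiteLLM fetches the `test-chat-prompt` prompt from Langfuse, compiles it with `prompt_variables`, and sends the result to `gpt-3.5-turbo`. The `messages` you pass in are ignored (see "What will the formatted prompt look like?" below).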

</TabItem>
<TabItem value="proxy" label="PROXY">

1. Setup config.yaml

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: langfuse/gpt-3.5-turbo
      prompt_id: "<langfuse_prompt_id>"
      api_key: os.environ/OPENAI_API_KEY
```
2. Start the proxy

```bash
litellm --config config.yaml --detailed_debug
```

3. Test it!

<Tabs>
<TabItem value="curl" label="CURL">

```bash
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "THIS WILL BE IGNORED"
        }
    ],
    "prompt_variables": {
        "key": "this is used"
    }
}'
```
</TabItem>
<TabItem value="OpenAI Python SDK" label="OpenAI Python SDK">

```python
import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "prompt_variables": { # [OPTIONAL]
            "key": "this is used"
        }
    }
)

print(response)
```

</TabItem>
</Tabs>

</TabItem>
</Tabs>

**Expected Logs:**

```
POST Request Sent from LiteLLM:
curl -X POST \
https://api.openai.com/v1/ \
-d '{'model': 'gpt-3.5-turbo', 'messages': <YOUR LANGFUSE PROMPT TEMPLATE>}'
```

## How to set model

### Set the model on LiteLLM

Prefix the LiteLLM model name with `langfuse/`, i.e. `langfuse/<litellm_model_name>`:

<Tabs>
<TabItem value="sdk" label="SDK">

```python
litellm.completion(
    model="langfuse/gpt-3.5-turbo", # or `langfuse/anthropic/claude-3-5-sonnet`
    ...
)
```

</TabItem>
<TabItem value="proxy" label="PROXY">

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: langfuse/gpt-3.5-turbo # OR langfuse/anthropic/claude-3-5-sonnet
      prompt_id: <langfuse_prompt_id>
      api_key: os.environ/OPENAI_API_KEY
```

</TabItem>
</Tabs>
### Set the model in Langfuse

If the model is specified in the Langfuse config, it will be used.

<Image img={require('../../img/langfuse_prompt_management_model_config.png')} />

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/chatgpt-v-2
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
```
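
Here the model configured in Langfuse takes precedence; the `azure/chatgpt-v-2` entry in config.yaml is presumably only used when the Langfuse prompt does not specify a model.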

## What is 'prompt_variables'?

- `prompt_variables`: A dictionary of variables that will be used to replace parts of the prompt.
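
For example, with a (hypothetical) prompt template stored in Langfuse that uses Langfuse's `{{variable}}` syntax, the variables are substituted into the template before the request is sent:

```python
# hypothetical template stored in Langfuse under prompt_id="test-chat-prompt":
#   "Answer this question politely: {{user_message}}"

prompt_variables = {"user_message": "this is used"}

# compiled prompt sent to the model:
#   "Answer this question politely: this is used"
```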

## What is 'prompt_id'?

- `prompt_id`: The ID of the prompt that will be used for the request.

<Image img={require('../../img/langfuse_prompt_id.png')} />
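
This is the name the prompt was saved under in Langfuse, e.g. `prompt_id="test-chat-prompt"` in the Quick Start above.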

## What will the formatted prompt look like?

### `/chat/completions` messages

The `messages` field sent in by the client is ignored.

The Langfuse prompt will replace the `messages` field.

To replace parts of the prompt, use the `prompt_variables` field. [See how prompt variables are used](https://github.com/BerriAI/litellm/blob/017f83d038f85f93202a083cf334de3544a3af01/litellm/integrations/langfuse/langfuse_prompt_management.py#L127)

If the Langfuse prompt is a string, it will be sent as a user message (not all providers support system messages).

If the Langfuse prompt is a list, it will be sent as is (Langfuse chat prompts are OpenAI compatible).
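
Putting the above together, a minimal sketch of the documented behavior (an illustration only; see the linked `langfuse_prompt_management.py` source for the actual implementation):

```python
from typing import Dict, List, Union


def apply_langfuse_prompt(
    compiled_prompt: Union[str, List[Dict]], data: dict
) -> dict:
    # illustration of the documented behavior, not LiteLLM's actual code
    if isinstance(compiled_prompt, list):
        # chat prompt: the OpenAI-compatible message list replaces `messages` as-is
        data["messages"] = compiled_prompt
    else:
        # text prompt: sent as a single user message
        data["messages"] = [{"role": "user", "content": compiled_prompt}]
    return data
```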

## Architectural Overview

<Image img={require('../../img/prompt_management_architecture_doc.png')} />

## API Reference

These are the params you can pass to the `litellm.completion` function in the SDK, and to `litellm_params` in the config.yaml:

```
prompt_id: str # required
prompt_variables: Optional[dict] # optional
langfuse_public_key: Optional[str] # optional
langfuse_secret: Optional[str] # optional
langfuse_secret_key: Optional[str] # optional
langfuse_host: Optional[str] # optional
```
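
For example, a sketch of passing Langfuse credentials per request from the SDK (key values are placeholders; parameter names are taken from the list above):

```python
import litellm

resp = litellm.completion(
    model="langfuse/gpt-3.5-turbo",
    prompt_id="test-chat-prompt",                       # required
    prompt_variables={"user_message": "this is used"},  # optional
    langfuse_public_key="pk-lf-...",                    # optional, instead of LANGFUSE_PUBLIC_KEY
    langfuse_secret_key="sk-lf-...",                    # optional, instead of LANGFUSE_SECRET_KEY
    langfuse_host="https://cloud.langfuse.com",         # optional
    messages=[{"role": "user", "content": "<IGNORED>"}],
)
```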