add how tos

shahules786 committed Sep 29, 2024
1 parent 43fe5d2 commit 25f7063

Showing 2 changed files with 298 additions and 14 deletions.
274 changes: 274 additions & 0 deletions docs/howtos/customisations/metrics/modifying-prompts-metrics.ipynb
@@ -0,0 +1,274 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f2be25ec-dad8-47b0-8152-d32fb03cdcf0",
"metadata": {},
"source": [
"## Modifying prompts in metrics\n",
"\n",
"Every metrics in ragas that uses LLM also uses one or more prompts to come up with intermediate results that is used for formulating scores. Prompts can be treated like hyperparameters when using LLM based metrics. An optimised prompt that suits your domain and use-case can increase the accuracy of your LLM based metrics by 10-20%. An optimal prompt is also depended on the LLM one is using, so as users you might want to tune prompts that powers each metric. \n",
"\n",
"Each prompt in Ragas is written using [Prompt Object](../). Please make sure you have an understanding of it before going further."
]
},
{
"cell_type": "markdown",
"id": "7bb78d22-4e13-46e6-98f8-fefd5bca9461",
"metadata": {},
"source": [
"### Understand the prompts of your Metric\n",
"\n",
"Since Ragas treats prompts like hyperparameters in metrics, we have a unified interface of `get_prompts` to access prompts used underneath any metrics. "
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "02a1fbc7-a48d-48f9-a6be-9449a1399ae5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'multi_turn_prompt': <ragas.metrics._simple_criteria.MultiTurnSimpleCriteriaWithReferencePrompt at 0x7f8c41410970>,\n",
" 'single_turn_prompt': <ragas.metrics._simple_criteria.SingleTurnSimpleCriteriaWithReferencePrompt at 0x7f8c41412590>}"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from ragas.metrics._simple_criteria import SimpleCriteriaScoreWithReference\n",
"scorer = SimpleCriteriaScoreWithReference(name='random',definition=\"some definition\")\n",
"scorer.get_prompts()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "25dadd55-316c-4bed-8fe5-a9d2525ad29c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your task is to judge the faithfulness of a series of statements based on a given context. For each statement you must return verdict as 1 if the statement can be directly inferred based on the context or 0 if the statement can not be directly inferred based on the context.\n"
]
}
],
"source": [
"prompts = scorer.get_prompts()\n",
"print(prompts['single_turn_prompt'].to_string())"
]
},
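{
"cell_type": "markdown",
"id": "4a1f0c2e-9b3d-4e5a-8c6f-1d2e3f4a5b6c",
"metadata": {},
"source": [
"Since `get_prompts()` returns a plain dictionary of prompt objects, you can survey every prompt a metric uses in one loop. The cell below is a minimal sketch that relies only on the `get_prompts` interface and the `instruction` attribute demonstrated in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b2c3d4e-5f6a-4b7c-8d9e-0a1b2c3d4e5f",
"metadata": {},
"outputs": [],
"source": [
"# Print the name and instruction of every prompt this metric uses\n",
"for name, prompt in scorer.get_prompts().items():\n",
"    print(f\"--- {name} ---\")\n",
"    print(prompt.instruction)"
]
},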
{
"cell_type": "markdown",
"id": "c3333fd8-252d-4710-bfc3-612a61e75ee8",
"metadata": {},
"source": [
"### Modifying instruction in default prompt\n",
"It is highly likely that one might to modify the prompt to suit ones needs. Ragas provides `set_prompts` methods to allow you to do so. Let's change the one of the prompts used in `FactualCorrectness` metrics"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "e5488da1-4b57-49e2-822e-80b3d3cad436",
"metadata": {},
"outputs": [],
"source": [
"prompt = scorer.get_prompts()['single_turn_prompt']\n",
"prompt.instruction += \"\\nOnly output valid JSON.\""
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "328c4346-6f75-4790-9ea0-dd6e2243c267",
"metadata": {},
"outputs": [],
"source": [
"scorer.set_prompts(**{\n",
" 'single_turn_prompt':prompt\n",
"})"
]
},
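{
"cell_type": "markdown",
"id": "8c3d4e5f-6a7b-4c8d-9e0f-1a2b3c4d5e6f",
"metadata": {},
"source": [
"The `**{...}` form above unpacks the dictionary into keyword arguments, so the same update can be written as a direct keyword call."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d4e5f6a-7b8c-4d9e-0f1a-2b3c4d5e6f7a",
"metadata": {},
"outputs": [],
"source": [
"# Equivalent to the dictionary-unpacking call above\n",
"scorer.set_prompts(single_turn_prompt=prompt)"
]
},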
{
"cell_type": "markdown",
"id": "e15ca95d-621c-4dea-8c48-660914b5f34d",
"metadata": {},
"source": [
"Let's check if the prompt's instruction has actually changed"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "3927c09e-a802-4d11-be91-35a2ebc811e9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Given a input, system response and reference. Evaluate and score the response against the reference only using the given criteria.\n",
"Only output valid JSON.\n",
"Only output valid JSON.\n"
]
}
],
"source": [
"print(scorer.get_prompts()['single_turn_prompt'].instruction)"
]
},
{
"cell_type": "markdown",
"id": "abfe4d3f-f579-4216-889e-04e2db22aade",
"metadata": {},
"source": [
"### Modifying examples in default prompt\n",
"Few shot examples can greatly influence the outcome of any LLM. It is highly likely that the examples in default prompt may not reflect your domain or use-case. So it's always an good practice to modify with your custom examples. Let's do one here"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "ce261ef5-18f3-40a4-950c-8dea83240fbf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(SingleTurnSimpleCriteriaWithReferenceInput(user_input='Who was the director of Los Alamos Laboratory?', response='Einstein was the director of Los Alamos Laboratory.', criteria='Score responses in range of 0 (low) to 5 (high) based similarity with reference.', reference='The director of Los Alamos Laboratory was J. Robert Oppenheimer.'),\n",
" SimpleCriteriaOutput(reason='The response and reference have two very different answers.', score=0))]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prompt = scorer.get_prompts()['single_turn_prompt']\n",
"\n",
"prompt.examples"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "6da7e53e-6b88-42ca-9b29-caff4729adf0",
"metadata": {},
"outputs": [],
"source": [
"from ragas.metrics._simple_criteria import SingleTurnSimpleCriteriaWithReferenceInput, SimpleCriteriaOutput"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "0cbc1d41-4c0e-49a1-aa62-eb02a9780cac",
"metadata": {},
"outputs": [],
"source": [
"new_example = [\n",
" (\n",
" SingleTurnSimpleCriteriaWithReferenceInput(\n",
" user_input='Who was the first president of the United States?',\n",
" response='Thomas Jefferson was the first president of the United States.',\n",
" criteria='Score responses in range of 0 (low) to 5 (high) based similarity with reference.',\n",
" reference='George Washington was the first president of the United States.'\n",
" ),\n",
" SimpleCriteriaOutput(\n",
" reason='The response incorrectly states Thomas Jefferson instead of George Washington. While both are significant historical figures, the answer does not match the reference.',\n",
" score=2\n",
" )\n",
" )\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "f1ba76f5-97b8-412f-81ca-066e6a256e58",
"metadata": {},
"outputs": [],
"source": [
"prompt.examples = new_example"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "73ef1ecf-27a2-4acf-b838-9b1fd28363e6",
"metadata": {},
"outputs": [],
"source": [
"scorer.set_prompts(**{\n",
" 'single_turn_prompt':prompt\n",
"})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "6599a64e-9313-41ee-b15c-e6d16819b9e7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(SingleTurnSimpleCriteriaWithReferenceInput(user_input='Who was the first president of the United States?', response='Thomas Jefferson was the first president of the United States.', criteria='Score responses in range of 0 (low) to 5 (high) based similarity with reference.', reference='George Washington was the first president of the United States.'), SimpleCriteriaOutput(reason='The response incorrectly states Thomas Jefferson instead of George Washington. While both are significant historical figures, the answer does not match the reference.', score=2))]\n"
]
}
],
"source": [
"print(scorer.get_prompts()['single_turn_prompt'].examples)"
]
},
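{
"cell_type": "markdown",
"id": "0e5f6a7b-8c9d-4e0f-1a2b-3c4d5e6f7a8b",
"metadata": {},
"source": [
"Putting both steps together: the helper below is a small convenience sketch, not part of the Ragas API. It uses only the `get_prompts`/`set_prompts` interface and the `instruction` and `examples` attributes shown in this notebook, and you can adapt it to batch your own prompt customisations."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f6a7b8c-9d0e-4f1a-2b3c-4d5e6f7a8b9c",
"metadata": {},
"outputs": [],
"source": [
"def customise_prompt(metric, prompt_name, instruction_suffix=None, examples=None):\n",
"    # Convenience wrapper (not a Ragas API): modify one of a metric's prompts in place.\n",
"    prompt = metric.get_prompts()[prompt_name]\n",
"    if instruction_suffix is not None:\n",
"        prompt.instruction += instruction_suffix\n",
"    if examples is not None:\n",
"        prompt.examples = examples\n",
"    metric.set_prompts(**{prompt_name: prompt})\n",
"\n",
"# e.g. customise_prompt(scorer, 'single_turn_prompt', instruction_suffix=\"\\nBe concise.\")"
]
},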
{
"cell_type": "markdown",
"id": "fe5187a1-b238-4e3e-a6bb-181d88743d42",
"metadata": {},
"source": [
"Let's now view and verify the full new prompt with modified instruction and examples"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f748f962-1fdf-42cb-b597-38fe65954460",
"metadata": {},
"outputs": [],
"source": [
"scorer.get_prompts()['single_turn_prompt'].to_string()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ragas",
"language": "python",
"name": "ragas"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
38 changes: 24 additions & 14 deletions docs/howtos/index.md
@@ -1,21 +1,31 @@
# 🛠️ How-to Guides

Each guide in this section provides a focused solution to real-world problems that you, as an experienced user, may encounter while using Ragas. These guides are designed to be concise and direct, offering quick solutions to your problems. We assume you have a foundational understanding and are comfortable with Ragas concepts. If not, feel free to explore the [Get Started](../getstarted/index.md) section first.

<div class="grid cards" markdown>

- :material-tune:{ .lg .middle } [__Customization__](customisations/index.md)

    ---

    How to customize various aspects of Ragas to suit your needs.

    Customize features such as [Metrics](customisations/metrics/) and [Testset Generation](customisations/testgenerator/).

- :material-cube-outline:{ .lg .middle } [__Applications__](applications/index.md)

    ---

    How to use Ragas for various applications and use cases.

    Includes applications such as [RAG evaluation](applications/index.md).

- :material-link-variant:{ .lg .middle } [__Integrations__](integrations/index.md)

    ---

    How to integrate Ragas with other frameworks and observability tools.

    Use Ragas with frameworks like [Langchain](integrations/langchain.md), [LlamaIndex](integrations/llamaindex.md), and observability tools.

</div>
