Welcome to Ragrank! This toolkit is designed to assist you in evaluating the performance of your Retrieval-Augmented Generation (RAG) applications. It provides purpose-built metrics for evaluating RAG models. Note that the product is still in beta.
Ragrank is available as a PyPI package. To install it, simply run:
pip install ragrank
If you prefer to install it from the source:
git clone https://github.com/Auto-Playground/ragrank.git && cd ragrank
poetry install
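Once installed, a quick sanity check is to import the package from a Python session (a minimal sketch, assuming the install above completed without errors):

# The import should succeed if the installation worked
import ragrank
print("Ragrank imported successfully")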
Set your OPENAI_API_KEY as an environment variable (you can also evaluate using your own custom model; refer to the docs):
export OPENAI_API_KEY="..."
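If you prefer to set the key from within Python (for example in a notebook), here is a minimal sketch using only the standard library:

import os

# Make the key available to Ragrank for the current process only
os.environ["OPENAI_API_KEY"] = "..."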
Here's a quick example of how you can use Ragrank to evaluate the relevance of generated responses:
from ragrank import evaluate
from ragrank.dataset import from_dict
from ragrank.metric import response_relevancy
# Define your dataset
data = from_dict({
    "question": "What is the capital of France?",
    "context": ["France is famous for its iconic landmarks such as the Eiffel Tower and its rich culinary tradition."],
    "response": "The capital of France is Paris.",
})
# Evaluate the response relevance metric
result = evaluate(data, metrics=[response_relevancy])
# Display the evaluation results
result.to_dataframe()
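Assuming to_dataframe returns a standard pandas DataFrame, the scores can be inspected or saved with ordinary pandas calls (the file name below is only an illustration):

# Print the metric scores and persist them for later comparison
df = result.to_dataframe()
print(df)
df.to_csv("ragrank_results.csv", index=False)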
For more information on how to use Ragrank and its various features, please refer to the documentation. 📚
This project is licensed under the Apache License. Feel free to use and modify it according to your needs.
If you encounter any issues, have questions, or would like to provide feedback, please don't hesitate to open an issue on the GitHub repository. Your contributions and suggestions are highly appreciated!
Join our community on Discord to connect with other users, ask questions, and share your experiences with Ragrank. We're here to help you make the most out of your NLP projects! 💬
Happy evaluating! 🙂