LLM Evaluation Framework
Try the demo
- OpenAI model selection
- Prompt definition with template variables
- Uploading a set of variables (CSV)
- Regex-based eval rules (see the sketch after this list)
- Running evals concurrently and showing rule results in a table
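The snippet below is a minimal sketch of how these pieces fit together: a templated prompt, variables supplied by CSV rows, a regex rule, and concurrent execution. The file name `variables.csv`, the template, the rule, and `call_model` are illustrative assumptions, not the app's actual code.

```python
# Illustrative sketch only; not the app's implementation.
import csv
import re
from concurrent.futures import ThreadPoolExecutor

PROMPT_TEMPLATE = "Translate the word '{word}' into {language}."
REGEX_RULE = r"\bbonjour\b"  # the eval passes if the model output matches this pattern


def call_model(prompt: str) -> str:
    # Hypothetical placeholder for the real LLM call (the app talks to the OpenAI API).
    return "bonjour"


def run_eval(variables: dict) -> bool:
    prompt = PROMPT_TEMPLATE.format(**variables)  # fill in the template variables
    output = call_model(prompt)
    return re.search(REGEX_RULE, output, re.IGNORECASE) is not None


# Each CSV row supplies one set of template variables, e.g. columns: word,language
with open("variables.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Run the evals concurrently and collect one pass/fail result per row
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_eval, rows))

print(results)  # e.g. [True, False, ...]
```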
Planned features
- Support more LLM models (Anthropic Claude, etc.)
- Extend the prompt to support separate System and User sections
- Define model settings (temperature, etc.)
- More eval types (semantic similarity, etc.)
- Better results visualization, with color for success/fail
- Extensibility for models and eval rules (one possible shape is sketched below)
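One way such an extension point could look is a small rule interface that new eval types plug into. The class names below (`EvalRule`, `RegexRule`, `ContainsRule`) are hypothetical and not part of the project.

```python
# A possible shape for pluggable eval rules; names are illustrative, not the project's API.
import re
from abc import ABC, abstractmethod


class EvalRule(ABC):
    """Base class: each rule decides whether a model output passes."""

    @abstractmethod
    def evaluate(self, output: str) -> bool: ...


class RegexRule(EvalRule):
    def __init__(self, pattern: str):
        self.pattern = re.compile(pattern)

    def evaluate(self, output: str) -> bool:
        return self.pattern.search(output) is not None


class ContainsRule(EvalRule):
    def __init__(self, substring: str):
        self.substring = substring.lower()

    def evaluate(self, output: str) -> bool:
        return self.substring in output.lower()


# New rule types (e.g. semantic similarity) would subclass EvalRule,
# so the evaluation loop and results table stay unchanged.
rules: list[EvalRule] = [RegexRule(r"\d{4}"), ContainsRule("paris")]
print([rule.evaluate("The Eiffel Tower opened in Paris in 1889.") for rule in rules])
```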
Demo video: streamlit-ui-2024-11-22-22-11-36.webm
Installation

- Clone the repository to your local machine
  ```bash
  git clone https://github.com/harlev/eva-l.git
  cd eva-l
  ```
- Create a virtual environment (optional but recommended)
  ```bash
  python -m venv env
  source env/bin/activate  # On Windows, use `env\Scripts\activate`
  ```
- Install the required dependencies
  ```bash
  pip install -r requirements.txt
  ```
- Optionally, create a `.env` file containing your OpenAI API key (a sketch of how such a key might be read is shown after these steps)
  ```
  OPENAI_API_KEY=<your API key>
  ```
- Run the Streamlit app
  ```bash
  streamlit run ui.py
  ```
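For reference, a key placed in `.env` is typically picked up with the python-dotenv package; this is only a sketch of that pattern and not necessarily how `ui.py` loads it.

```python
# Sketch: reading OPENAI_API_KEY from a local .env file, assuming python-dotenv is installed.
import os

from dotenv import load_dotenv

load_dotenv()  # loads key=value pairs from .env into environment variables
api_key = os.getenv("OPENAI_API_KEY")
print("API key configured:", api_key is not None)
```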