LLM Evaluation Framework
Try the demo
- OpenAI model selection
- Prompt definition with template variables
- Uploading a set of variables (CSV)
- Regex-based eval rules (see the sketch after this list)
- Running evals concurrently and showing rule results in a table
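The snippet below is a minimal sketch of how these pieces fit together: a templated prompt, variables supplied by CSV rows, a regex rule, and concurrent execution. The file name `variables.csv`, the template, the rule, and `call_model` are illustrative assumptions, not the app's actual code.

```python
# Illustrative sketch only; not the app's implementation.
import csv
import re
from concurrent.futures import ThreadPoolExecutor

PROMPT_TEMPLATE = "Translate the word '{word}' into {language}."
REGEX_RULE = r"\bbonjour\b"  # the eval passes if the model output matches this pattern


def call_model(prompt: str) -> str:
    # Hypothetical placeholder for the real LLM call (the app talks to the OpenAI API).
    return "bonjour"


def run_eval(variables: dict) -> bool:
    prompt = PROMPT_TEMPLATE.format(**variables)  # fill in the template variables
    output = call_model(prompt)
    return re.search(REGEX_RULE, output, re.IGNORECASE) is not None


# Each CSV row supplies one set of template variables, e.g. columns: word,language
with open("variables.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Run the evals concurrently and collect one pass/fail result per row
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_eval, rows))

print(results)  # e.g. [True, False, ...]
```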
Planned features
- Support more LLM models (Anthropic Claude, etc.)
- Extend the prompt to support separate System and User sections
- Define model settings (temperature, etc.)
- More eval types (semantic similarity, etc.)
- Better results visualization, with color for success/fail
- Extensibility for models and eval rules (one possible shape is sketched below)
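One way such an extension point could look is a small rule interface that new eval types plug into. The class names below (`EvalRule`, `RegexRule`, `ContainsRule`) are hypothetical and not part of the project.

```python
# A possible shape for pluggable eval rules; names are illustrative, not the project's API.
import re
from abc import ABC, abstractmethod


class EvalRule(ABC):
    """Base class: each rule decides whether a model output passes."""

    @abstractmethod
    def evaluate(self, output: str) -> bool: ...


class RegexRule(EvalRule):
    def __init__(self, pattern: str):
        self.pattern = re.compile(pattern)

    def evaluate(self, output: str) -> bool:
        return self.pattern.search(output) is not None


class ContainsRule(EvalRule):
    def __init__(self, substring: str):
        self.substring = substring.lower()

    def evaluate(self, output: str) -> bool:
        return self.substring in output.lower()


# New rule types (e.g. semantic similarity) would subclass EvalRule,
# so the evaluation loop and results table stay unchanged.
rules: list[EvalRule] = [RegexRule(r"\d{4}"), ContainsRule("paris")]
print([rule.evaluate("The Eiffel Tower opened in Paris in 1889.") for rule in rules])
```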
Demo video: streamlit-ui-2024-11-22-22-11-36.webm
Installation

- Clone the repository to your local machine
  ```bash
  git clone https://github.com/harlev/eva-l.git
  cd eva-l
  ```
- Create a virtual environment (optional but recommended)
  ```bash
  python -m venv env
  source env/bin/activate  # On Windows, use `env\Scripts\activate`
  ```
- Install the required dependencies
  ```bash
  pip install -r requirements.txt
  ```
- Optionally, create a `.env` file containing your OpenAI API key (a sketch of how such a key might be read is shown after these steps)
  ```
  OPENAI_API_KEY=<your API key>
  ```
- Run the Streamlit app
  ```bash
  streamlit run ui.py
  ```
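For reference, a key placed in `.env` is typically picked up with the python-dotenv package; this is only a sketch of that pattern and not necessarily how `ui.py` loads it.

```python
# Sketch: reading OPENAI_API_KEY from a local .env file, assuming python-dotenv is installed.
import os

from dotenv import load_dotenv

load_dotenv()  # loads key=value pairs from .env into environment variables
api_key = os.getenv("OPENAI_API_KEY")
print("API key configured:", api_key is not None)
```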