Bringing Evals into the Langchain::Assistant #853
andreibondarev started this conversation in Ideas
Replies: 2 comments · 2 replies
- @bborn I created this discussion thread to talk about how we could integrate the evals. Maybe we could flesh things out here before implementing?
1 reply
- @bborn Take a glance: #855 (comment)
1 reply
It's time to introduce a lightweight way to run evals on the Langchain::Assistant execution output. Regardless of which metrics are being evaluated, I'd like to figure out a good DSL for how evals are integrated into the Langchain::Assistant.
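To make that concrete, here is one rough direction the DSL could take. This is purely a sketch to react to: the `EvalSuite` class and the `metric` method below are illustrative names, not anything that exists in the gem today.

```ruby
# Hypothetical sketch only -- EvalSuite and `metric` are not part of langchainrb.
class EvalSuite
  Metric = Struct.new(:name, :block)

  def initialize(&definition)
    @metrics = []
    instance_eval(&definition)
  end

  # DSL entry point: register a named check as a block taking (input, output)
  def metric(name, &block)
    @metrics << Metric.new(name, block)
  end

  # Score a single (input, output) pair against every registered metric
  def score(input, output)
    @metrics.to_h { |m| [m.name, m.block.call(input, output)] }
  end
end

suite = EvalSuite.new do
  metric(:mentions_answer) { |_input, output| output.include?("4") }
  metric(:concise)         { |_input, output| output.split.size < 50 }
end

suite.score("What is 2 + 2?", "The answer is 4.")
# => { mentions_answer: true, concise: true }
```

The open design question is whether something like this hangs off the Assistant itself (passed in at construction, or attached via a block) or lives as a separate object that wraps a run.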
Agent interactions are generally a matter of inputs and outputs: given a collection of AI agent inputs and corresponding ideal outputs, we should be able to run our AI agent through this dataset and compare.
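Roughly, the harness loop could look something like the sketch below. The dataset shape and the naive `include?` comparison are placeholders for whatever metric(s) we settle on; the Assistant calls just mirror the existing `add_message` / `run` / `messages` flow.

```ruby
require "langchain"

llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])

# Toy dataset of inputs and their ideal outputs
dataset = [
  { input: "What is 2 + 2?",                 ideal: "4" },
  { input: "What is the capital of France?", ideal: "Paris" }
]

results = dataset.map do |example|
  # Fresh assistant per example so conversation state doesn't leak between runs
  assistant = Langchain::Assistant.new(llm: llm, instructions: "Answer concisely.")
  assistant.add_message(content: example[:input])
  assistant.run(auto_tool_execution: true)

  actual = assistant.messages.last.content

  {
    input:  example[:input],
    ideal:  example[:ideal],
    actual: actual,
    pass:   actual.include?(example[:ideal]) # placeholder comparison
  }
end

pass_rate = results.count { |r| r[:pass] }.fdiv(results.size)
puts "Pass rate: #{(pass_rate * 100).round(1)}%"
```

The per-example result hashes are what the eval DSL would then aggregate or assert on, whether the comparison is exact match, semantic similarity, or an LLM-as-judge.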
A few questions to consider: