
Updated testset generation question prompts to use JSON formatting instructions + prompt tweaks for smaller LLMs #1354

Closed
wants to merge 4 commits into from

Conversation

@fschuh commented Sep 24, 2024

This PR is mostly targeted at making the Ragas testset generation prompts work with smaller LLMs that can be run locally.

Some of the testset generation prompts are currently not making use of the JSON format instructions that were added to the eval prompts not too long ago. This PR fixes that.

Also, a few other notable fixes in this PR:

  • Improved testset generation question rewriting prompts to reduce verbosity so they work with smaller and chattier LLMs
  • Improved JSON input rewrite prompt to reduce verbosity so it works with smaller LLMs
  • Fixed an incorrect index in one of the examples of the find_relevant_contexts prompt, making it explicit that the index is 1-based.
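For illustration, the kind of "JSON format instructions" being added to the prompts can be sketched as follows. This is a minimal stand-alone example, not Ragas's actual prompt code; the function name and schema are hypothetical:

```python
import json

def add_json_format_instructions(prompt: str, schema: dict) -> str:
    """Append explicit JSON output instructions to a prompt.

    Smaller local LLMs tend to wrap answers in prose or markdown,
    so spelling out the exact expected JSON shape improves parse rates.
    """
    return (
        prompt
        + "\n\nReturn ONLY a JSON object matching this schema, "
        + "with no extra commentary:\n"
        + json.dumps(schema, indent=2)
    )

# Example: instruct the model to answer with a single "question" field.
rewritten = add_json_format_instructions(
    "Rewrite the question to be self-contained.",
    {"question": "string"},
)
print(rewritten)
```

Keeping the schema in the prompt itself, rather than relying on the model's defaults, is what makes the same prompt usable across both large and small models.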

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Sep 24, 2024
@jjmachan (Member)

@fschuh thanks a lot for pushing this PR, but we are reworking the testset generation in #1321.

We would love to run the updated version by you and work to get these fixes applied there too, if you're interested.

@fschuh (Author) commented Sep 26, 2024

Sure thing, I can take a look and see whether I can apply these fixes to the new experimental testset generation, if they are still relevant of course.
Would it make sense to apply the fixes to both? I still find the old testset generation quite good; it's a bit slower than I'd like, but it generates good questions.

@jjmachan (Member)

@shahules786 can you take a look?

@shahules786 (Member)

Hey @fschuh, thanks for pushing the PR. I see the changes in prompts, but prompt changes are almost always specific to the model you're using. We understand it has been hard to change the prompts of Ragas metrics and testgen components. From v0.2 we will introduce set_prompts and get_prompts interfaces for all the components, so that you can modify prompts without installing ragas from source and then raising a PR. Prompts are fundamentally just hyperparameters for these algorithms.
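To make the idea concrete, here is a toy sketch of what a set_prompts/get_prompts interface could look like. The class names and prompt slot names are hypothetical; this is not the actual Ragas v0.2 API:

```python
class Prompt:
    """Minimal stand-in for a prompt object."""
    def __init__(self, instruction: str):
        self.instruction = instruction

class Component:
    """Toy stand-in for a Ragas metric or testgen component."""
    def __init__(self):
        self._prompts = {
            "question_rewrite": Prompt("Rewrite the question concisely."),
        }

    def get_prompts(self) -> dict:
        # Return a copy so callers can inspect without mutating internals.
        return dict(self._prompts)

    def set_prompts(self, **prompts: Prompt) -> None:
        # Only known prompt slots may be overridden.
        for name, prompt in prompts.items():
            if name not in self._prompts:
                raise KeyError(f"unknown prompt: {name}")
            self._prompts[name] = prompt

# Override a prompt for a smaller model without touching library source.
comp = Component()
comp.set_prompts(question_rewrite=Prompt("Rewrite tersely; output JSON only."))
print(comp.get_prompts()["question_rewrite"].instruction)
```

Treating prompts as named, swappable slots is what makes them behave like hyperparameters: a user can tune them per model instead of patching the library.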

@KylinMountain

I think we could augment the JSON parser so that it can parse JSON embedded in markdown or other surrounding text.
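A minimal sketch of such a lenient parser (an assumption of how it might work, not existing Ragas code):

```python
import json
import re

def parse_llm_json(text: str) -> dict:
    """Extract and parse a JSON object from raw LLM output.

    Tolerates markdown code fences and surrounding prose, which
    smaller LLMs frequently wrap around their JSON answers.
    """
    # Prefer the content of a fenced code block if one is present.
    fenced = re.search(r"```(?:json)?(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Take the outermost {...} span and parse it.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM output")
    return json.loads(text[start:end + 1])

# A chatty model's answer, with a fence and trailing prose.
fence = "`" * 3
raw = f'Sure! {fence}json\n{{"question": "What is RAG?"}}\n{fence} Hope that helps!'
print(parse_llm_json(raw))
```

This keeps strict JSON prompts as the primary mechanism while degrading gracefully when the model decorates its output anyway.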

@shahules786 (Member) commented Sep 29, 2024

@KylinMountain that's a good idea. I have also been thinking about it. The Prompt object should ideally be able to convert and parse prompts in any format, like markdown, XML, etc.
This could be too much for the 0.2 release, so I'll add it to discussions for 0.3.
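As a rough sketch of that idea (a hypothetical shape, not a committed design, and showing only the rendering half, not parsing):

```python
from dataclasses import dataclass, field

@dataclass
class Prompt:
    """Toy prompt that can render itself in multiple wire formats."""
    instruction: str
    examples: list = field(default_factory=list)

    def to_markdown(self) -> str:
        lines = ["## Instruction", self.instruction, "## Examples"]
        lines += [f"- {ex}" for ex in self.examples]
        return "\n".join(lines)

    def to_xml(self) -> str:
        body = "".join(f"<example>{ex}</example>" for ex in self.examples)
        return (
            f"<prompt><instruction>{self.instruction}</instruction>"
            f"{body}</prompt>"
        )

p = Prompt("Rewrite the question.", ["Q: who? -> Q: who wrote it?"])
print(p.to_markdown())
print(p.to_xml())
```

Rendering the same Prompt object into whichever format a given model follows best (markdown, XML, plain JSON) would let format choice become another tunable knob rather than a hard-coded decision.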

@KylinMountain

> @KylinMountain that's a good idea. I also have been thinking about it. The Prompt object should ideally be able to convert and parse prompts to any format like markdown, XML, etc. This could be too much for 0.2 release, so I'll add this to discussions for 0.3.

Glad to hear that.

@jjmachan (Member)

@shahules786 are we merging this in or should we create a new issue to track this problem?

@shahules786 (Member) left a comment

@jjmachan No.
@fschuh If you're happy with the explanation of why this is not mergeable, please consider closing the PR.


@jjmachan (Member) commented Oct 4, 2024

Closing this since we have merged get and set prompts in #1391.

@fschuh would love to get your feedback on the new way, and whether it solves the original problem you were fixing with this PR.

Are you on Discord?

@jjmachan jjmachan closed this Oct 4, 2024