Custom Prompts via YAML #37
Conversation
I finished the review of the code, and it looks great, @falquaddoomi. I've added a couple minor comments. Now, I'm gonna run/test it and report back.
```python
# FIXME: which takes priority, the files collection in ai_revision-config.yaml
# or the prompt_file? we went with config taking precedence for now
```
I think that only one of the two should be present: either 1) a `prompt_files` key in the file `ai_revision-prompts.yaml`, or 2) a `files`/`matchings` key in `ai_revision-config.yaml`. But the user could add both, and I agree that the config should take precedence. What if we print a warning here? It might be useful for the user.
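For illustration, a minimal sketch of the behavior being proposed here (the function and argument names are hypothetical, not the actual manubot-ai-editor API):

```python
import warnings

# Hypothetical sketch: if both sources define file-to-prompt mappings,
# prefer the config file and warn the user about the conflict.
def choose_prompt_mappings(config_matchings, prompt_files):
    if config_matchings and prompt_files:
        warnings.warn(
            "Both ai_revision-config.yaml (files/matchings) and "
            "ai_revision-prompts.yaml (prompt_files) map files to prompts; "
            "ai_revision-config.yaml takes precedence."
        )
        return config_matchings
    return config_matchings or prompt_files
```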
**Deprecated: This testing procedure is now part of the PR description**

Here, I describe the live testing procedure using this feature through GitHub Actions and the `ai-revision` workflow.

### Setting up the Manubot AI Editor

These steps create the conda environment, install Manubot, and override the Manubot AI Editor package using this branch's version:

```bash
gh pr checkout 37
conda env create -f environment.yml -n manubot-ai-editor-testing
conda activate manubot-ai-editor-testing
pip install --upgrade manubot[ai-rev]
pip install -e .
```

Export your OpenAI API key:

```bash
export OPENAI_API_KEY="<api_key>"
```

### Unit tests

Run all the unit tests:

```bash
pytest --runcost tests/
```

All unit tests should have passed.

### Manuscripts used for testing

I've tested this new version of the Manubot AI Editor using the original three manuscripts we used in the preprint. For each of these repositories, I've created different branches that reflect the different test cases present in folder `tests/config_loader_fixtures`.

For convenience, I put here the git commands to clone each manuscript repo:

```bash
git clone git@github.com:pivlab/phenoplier-manuscript-test.git
git clone git@github.com:pivlab/ccc-manuscript-test.git
git clone git@github.com:pivlab/manubot-gpt-manuscript-test.git
```

Make sure also that there is a repository secret named `OPENAI_API_KEY` with your API key.

### Local tests: DebuggingManuscriptRevisionModel

Before hitting the OpenAI API, I ran a local revision model (`DebuggingManuscriptRevisionModel`) to ensure the right parameters, such as the prompt, are being used for each paragraph.

First, clone one of the manuscripts above, such as PhenoPLIER, and check out one of the test case branches:

```bash
cd <MANUSCRIPT_REPO_DIR>
git checkout <TESTING_BRANCH>
```

Run using a local model for debugging:

```bash
manubot ai-revision \
  --content-directory content/ \
  --model-type DebuggingManuscriptRevisionModel
```

Then, I checked that `{section_name}`, `{title}` and keywords are correctly replaced.

### GitHub Actions tests (uses GPT3CompletionModel)

#### Setting up workflow

Make sure each manuscript's `ai-revision` workflow is installing the `manubot-ai-editor` version to be tested. For this, open `.github/workflows/ai-revision.yaml` and make sure the last line here is present:

```yaml
- name: Install Manubot AI revision dependencies
  run: |
    # install using the same URL used for manubot in build/environment.yml
    manubot_line=$(grep "github.com/manubot/manubot" build/environment.yml)
    manubot_url=$(echo "$manubot_line" | awk -F"- " '{print $2}')
    pip install ${manubot_url}#egg=manubot[ai-rev]
    # install manubot-ai-editor for this PR
    pip install -U git+https://github.com/falquaddoomi/manubot-ai-editor@issue-31-customprompts-yaml
```

#### Kicking off workflow

For the live tests using the OpenAI models, kick off the `ai-revision` workflow as described in the PR description.

#### Assertions

See the assertions listed in the PR description.
…on via YAML config files.
…anuscript(), passed down to model-specific prompt resolution via the 'resolved_prompt' argument.
…t if it's explicitly ignored. revise_manuscript() now checks for this sentinel and ignores the file.
…script folders and config folders temporarily for testing. Adds set_directory() helper to temporarily switch folders.
… on the phenoplier full manuscript for the "both files" scenario.
we are doing prompt testing separately now
…ed as a literal string rather than an index into prompts
…fied. Adds tests to verify the warning is shown.
…erence as a reference into prompts, not a literal.
…n to all invocations of model.get_prompt()
…t was with the other prompts.
… adds a title so we can test the placeholder replacement of {title}.
…t be literal text; added the default prompt where it was missing.
…lls in GPT3CompletionModel.get_prompt()
…ManuscriptRevisionModel and then looks for the prompts' text in the resulting .md files
…r keyword args, since we typically want to customize them
Ah, good catch! Yes, let's remove that code and keep the configuration in one place. Thank you!
…ply title, keywords.
…ince it's no longer hardcoded as ignored
Hey @miltondp, I made a few changes that should check for the correct title and keyword replacements in the prompts. Feel free to run your tests again, and thank you. FYI, the code that does the replacements is here, in case there's something missing that should be replaced: `manubot-ai-editor/libs/manubot_ai_editor/models.py`, lines 314 to 322 at commit `f9f89d1`.

It's pretty much identical to the other parts, and probably should be refactored to just use that replacement dictionary now that I'm looking at it again. In fact, I'm going to go ahead and change that, since there's no reason to compute the replacement dict twice. EDIT: the commit after this comment factors it into a shared dict.
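For illustration, a minimal sketch of the shared replacement dictionary idea discussed above (the function and argument names are hypothetical; the actual code lives in `models.py`):

```python
# Hypothetical sketch: compute the placeholder replacements once and apply
# them to a prompt template (illustrative only, not the actual models.py code).
def fill_prompt(prompt_template: str, title: str, keywords: list, section_name: str) -> str:
    replacements = {
        "{title}": title,
        "{keywords}": ", ".join(keywords),
        "{section_name}": section_name,
    }
    for placeholder, value in replacements.items():
        prompt_template = prompt_template.replace(placeholder, value)
    return prompt_template
```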
…s so that tests that don't specify one will pass
**Testing results**

Current status (5/6/2024): I followed the procedure described in this comment.

**History of changes**

**Tests**
Thank you for the testing results, @falquaddoomi. You followed this testing procedure, right? (Your link to the comment does not work). Some comments for the tests on the
Sorry about that; somehow the PR number got dropped from the link, but I updated it and verified that the link is now working. Anyway, yes, I did follow your testing procedure from the now corrected link.
No, that line you linked to references a commit that's older than the fix I introduced at commit manubot/manubot@06fcecb. You'd have to update that line to the following:
I've tested the
Well, I didn't want to overwrite your existing PRs so that you could compare the two. Sure, though, I'll just go ahead and run it without the suffix, no problem.
Perfect! Thank you for checking that. Do you want to create a PR on the three manuscript repos so we update the Manubot version on them? Once we merge this change to use the fixed Manubot version, we can update the branches (with the tests) on each manuscript repository so all these local tests would pass.
Ah, of course, I see. No problem with overriding these PRs, we can keep overriding them until the tests pass. Do you want to finish the other GitHub Actions tests on the PhenoPLIER manuscript test repository? Maybe we can first fix the manubot version (comment above), and then create one branch for each of the unit tests that you wrote in the file
Sure thing. I created the PRs (below), but then I realized that you didn't specify which branch/repo you wanted them created against; I assumed it was the
Actually, I see your point about not creating new branches per test run; otherwise the repo will be littered with these branches, and you can always see the history in the single branch per test. I'm going to go ahead and delete the
Sure, I can run the remaining GitHub Actions. I'll wait for those PRs above to get merged, then I'll rebase each of the testing branches onto On a side note, I noticed that the phenoplier manuscript test repo only has two of the branches you mentioned,
I think that sounds reasonable, although if it's ok with you I'd prefer that you create these evaluation criteria, since I assume you have a better idea of what you intended with the spec and the larger context in which
I totally agree on moving the testing procedure from a comment in the middle of our discussion to either the wiki, or perhaps to the description of this PR so we don't have to search for it. Regarding the testing results, my personal opinion is that they're part of the discussion relating to this PR so I wouldn't mind continuing to post them as comments, but I'm open to moving them somewhere more convenient for you if this is getting unwieldy.
Perfect. I approved all the PRs on these manuscripts, so you can merge them and then rebase the test branches.
Exactly.
Perfect.
👍🏻
Yes, that's what I meant: if you can create those branches based on your unit tests. Sorry for the confusion. Feel free to assess whether we want to add all the unit tests or just a subset. Since the testing on GitHub Actions is more time-consuming for us, it would make sense to include only tests relevant to this context (GitHub Actions) since they are already being tested via unit tests.
Yes, absolutely. I'll add these after you merge your PRs (with the manubot version update) into the manuscripts.
Yes, I can work on the criteria for testing and then we can review it together. Once we feel we are covering all the functionality that we want to cover, we can move on to the other manuscripts.
Perfect, I like that. Do you want to move the testing procedure to the PR description? Looks like everyone can edit the PR description, so that's great. I would add, at the top, a "history of changes" (including dates) that gets updated when major changes are included. Regarding the testing results, I also like the idea of having them posted as comments, but I would probably keep a single one, also with a history of changes (including dates).
Looks good to me, and I think it's ready for merging and a new release.
…ompts with a single default
…te doc on how to use the custom prompts system.
…RRIDE is specified, added test for when DEFAULT_PROMPT_OVERRIDE is specified.
Looks great. I left some edits and comments for you to revise. But this is good to merge already.
docs/custom-prompts.md (Outdated)
- `{section_name}`: the name of the section (e.g., "introduction", "conclusion"), derived from the filename*

*(\* The mechanism that produces `section_name` is out of the scope of this document, but you can find the implementation in [editor.py](https://github.com/falquaddoomi/manubot-ai-editor/blob/issue-31-customprompts-yaml/libs/manubot_ai_editor/editor.py#L178-L211))*
I think we have a set of fixed, defined sections, right? If the file name indicates any of them, we return the corresponding value from that fixed set. I know you put a link to the implementation, but it might be useful to mention this fixed set of section values here (I left a suggestion above). Also, the link points to your personal repository instead of the official one.
I committed your suggestion, thanks for that. Regarding the link, I linked to my PR (which is in my personal repo) because once this gets merged the line references will be out of date. I think what I'm going to do instead is just describe the mechanism by which it's resolved here, since it's not all that complicated, and just forego linking to the source.
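As a rough illustration of the mechanism being described (a sketch only: the filename pattern and the set of section names below are assumptions, not the definitive list used by `editor.py`):

```python
import re

# Assumed, illustrative set of recognized section names (not the definitive list).
KNOWN_SECTIONS = {"abstract", "introduction", "results", "discussion", "methods", "conclusions"}

def infer_section_name(filename: str):
    """Sketch: map a filename like '02.introduction.md' to 'introduction'."""
    match = re.match(r"^\d+\.(?P<name>[^.]+)\.md$", filename)
    if match and match.group("name") in KNOWN_SECTIONS:
        return match.group("name")
    return None
```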
Co-authored-by: Milton Pividori <[email protected]>
…e to none to run, rather than just skipping them if it's not none
This PR addresses #31. Specifically, it adds support for parsing two new configuration files, `ai_revision-config.yaml` and `ai_revision-prompts.yaml`, that specify how filenames are mapped to prompts and what the text of those prompts is.

In the following, I refer to determining the prompt for a given piece of text as "prompt resolution". The current code supports the following mechanisms for prompt resolution:

- the `AI_EDITOR_CUSTOM_PROMPT` environment variable; if specified, this text is used as-is as the prompt.
- `models.GPT3CompletionModel.get_prompt()`, where the section name is inferred and used to find a prompt and, failing that, a default prompt is used.

This PR adds a YAML-based filename-to-prompt resolution mechanism, detailed in issue #31. The mechanism is implemented in the class `prompt_config.ManuscriptPromptConfig`.
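As a rough, hypothetical sketch of what these two files might look like (the key names follow the ones discussed in this PR, such as `prompts`, `files`, and `matchings`, but they are illustrative only; see `docs/custom-prompts.md` for the actual schema):

```yaml
# ai_revision-prompts.yaml (hypothetical sketch)
prompts:
  introduction: |
    Revise the following paragraph of the Introduction of "{title}"...
  default: |
    Proofread the following paragraph...

# ai_revision-config.yaml (hypothetical sketch)
files:
  matchings:
    - files:
        - 02.introduction.md
      prompt: introduction
```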
**Usage:**

Once instantiated, an instance of the `ManuscriptPromptConfig` class can be used to resolve a prompt based on a filename via the `ManuscriptPromptConfig.get_prompt_for_filename()` method. The method takes `filename` as an argument, which it then uses to consult its configuration for a matching prompt. The method returns a tuple like `(prompt: str|None, match: re.Match|None)`, where:

- `prompt` is the prompt text matching the filename, or `None` if the file was explicitly ignored in the config or if no match was found, and
- `match` is the resulting `re.Match` instance from when the filename was applied to a regular expression. In cases where a regex wasn't applied (e.g., if we're returning the default prompt), this value is `None`.

The optional `use_default` argument (`True` by default) allows the method to return a default prompt specified in the config files if no match was found. If it's `False` and no match was found, `(None, None)` is returned.
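A short usage sketch based on the description above (the import path and constructor arguments are assumptions for illustration; only `get_prompt_for_filename()` and its return shape are taken from this PR):

```python
from manubot_ai_editor.prompt_config import ManuscriptPromptConfig

# Hypothetical construction; the real constructor arguments may differ.
prompt_config = ManuscriptPromptConfig(config_dir="content")

prompt, match = prompt_config.get_prompt_for_filename("02.introduction.md")

if prompt is None:
    # the file was explicitly ignored, or no match was found
    pass
else:
    # 'match' is the re.Match produced when the filename matched a regex,
    # or None if the default prompt was returned without applying a regex
    print(prompt)
```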
**Integration into the existing code:**

An instance of this class is created in `ManuscriptEditor`'s constructor as `self.prompt_config`. This instance is used in `ManuscriptEditor.revise_manuscript()` to resolve a prompt for each filename in the manuscript.

The resolved prompt is passed down all the way to each model's `get_prompt()` method. In `GPT3CompletionModel.get_prompt()`, after checking for a custom prompt, the resolved prompt is then used. If the resolved prompt is falsey, then prompt resolution occurs as it did before (i.e., the section name is inferred and used to find a prompt and, failing that, a default prompt is used).
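In rough pseudocode, the resulting priority order inside `get_prompt()` looks something like this (a sketch of the behavior described above, not the actual implementation):

```python
# Sketch of the prompt-resolution priority described above (illustrative only).
def choose_prompt(custom_prompt, resolved_prompt, section_prompts, section_name, default_prompt):
    if custom_prompt:        # e.g., AI_EDITOR_CUSTOM_PROMPT
        return custom_prompt
    if resolved_prompt:      # from the YAML-based config added in this PR
        return resolved_prompt
    # pre-existing behavior: section-based prompt, then the default prompt
    return section_prompts.get(section_name, default_prompt)
```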
**History of changes**

- …and keyword, but code outside of manubot-ai-editor wasn't supplying it, leading to an exception. As of commit 3ae73de, a default title and keyword list is now provided, which should resolve the exception.
- …but no GitHub/PyPI release as of yet.
## Testing Procedure

Here, I describe the live testing procedure using this feature through GitHub Actions and the `ai-revision` workflow. This is the typical use case for the Manubot AI Editor.

### Setting up the Manubot AI Editor

These steps create the conda environment, install Manubot, and override the Manubot AI Editor package using this branch's version:

```bash
gh pr checkout 37
conda env create -f environment.yml -n manubot-ai-editor-testing
conda activate manubot-ai-editor-testing
pip install --upgrade manubot[ai-rev]
pip install -e .
```
Export your OpenAI API key:
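```bash
export OPENAI_API_KEY="<api_key>"
```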
### Unit tests

Run all the unit tests:
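```bash
pytest --runcost tests/
```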
All unit tests should have passed.
### Manuscripts used for testing

I've tested this new version of the Manubot AI Editor using the original three manuscripts we used in the preprint. I forked the original manuscript repositories here: `pivlab/phenoplier-manuscript-test`, `pivlab/ccc-manuscript-test`, and `pivlab/manubot-gpt-manuscript-test`.

For each of these repositories, I've created different branches that reflect the different test cases present in folder `tests/config_loader_fixtures`:

- `pr37_test-both_prompts_config`
- `pr37_test-conflicting_promptsfiles_matchings`
- `pr37_test-only_revision_prompts`
- `pr37_test-single_generic_prompt` (proofreading prompt)
- `pr37_test-prompts_are_used` (checks that prompts are used in each section; similar to `test_prompt_config.py/test_prompts_apply_gpt3`, which adds a sentinel value to the paragraph)

For convenience, I put here the git commands to clone each manuscript repo:
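```bash
git clone git@github.com:pivlab/phenoplier-manuscript-test.git
git clone git@github.com:pivlab/ccc-manuscript-test.git
git clone git@github.com:pivlab/manubot-gpt-manuscript-test.git
```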
Make sure also that there is a repository secret named `OPENAI_API_KEY` with your API key.

### Local tests: DebuggingManuscriptRevisionModel

Before hitting the OpenAI API, I ran a local revision model (`DebuggingManuscriptRevisionModel`) to ensure the right parameters, such as the prompt, are being used for each paragraph.

First, clone one of the manuscripts above, such as PhenoPLIER, and check out one of the test case branches:
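```bash
cd <MANUSCRIPT_REPO_DIR>
git checkout <TESTING_BRANCH>
```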
Run using a local model for debugging:
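```bash
manubot ai-revision \
  --content-directory content/ \
  --model-type DebuggingManuscriptRevisionModel
```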
Then, I checked the following: `{section_name}`, `{title}` and keywords are correctly replaced.

### GitHub Actions tests (uses GPT3CompletionModel)

#### Setting up workflow
Make sure each manuscript's `ai-revision` workflow is installing the `manubot-ai-editor` version to be tested. For this, open `.github/workflows/ai-revision.yaml` and make sure the last line here is present:
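```yaml
- name: Install Manubot AI revision dependencies
  run: |
    # install using the same URL used for manubot in build/environment.yml
    manubot_line=$(grep "github.com/manubot/manubot" build/environment.yml)
    manubot_url=$(echo "$manubot_line" | awk -F"- " '{print $2}')
    pip install ${manubot_url}#egg=manubot[ai-rev]
    # install manubot-ai-editor for this PR
    pip install -U git+https://github.com/falquaddoomi/manubot-ai-editor@issue-31-customprompts-yaml
```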
#### Kicking off workflow

For the live tests using the OpenAI models:

1. Go to the `ai-revision` workflow.
2. Click on the `Run workflow` combobox, and enter this information in each field: `main`, `<BRANCH_TO_REVISE>` (such as `pr37_test-both_prompts_config`), and `ai-revision-<BRANCH_TO_REVISE>`.
3. Click `Run workflow` and wait for it to finish.

#### Assertions
Open the `ai-revision` workflow run and then `AI Revise`:

- In `Install Manubot AI revision dependencies`, make sure `manubot-ai-editor-0.5.0` was installed.