Framework for adding context to LLM prompt #993
Conversation
@michaelchia Thank you for working on such a significant potential addition to Jupyter AI! Given the size and scope of this PR, it will take more time for us to review this and determine if this user experience is aligned with our longer-term vision for Jupyter AI. We appreciate your patience in the meantime. 🙏 I've rebased your PR for review purposes.
@michaelchia Wow, just watched the demo videos, and this feature is mind-blowingly awesome! 🎉 🎉 I love how easy it is to just point Jupyternaut at a file and allow its context to be used to answer your query. Only two callouts regarding the UX:
@michaelchia wow, this is really amazing! Thanks for putting all the time into this, can't wait to try it out.
@dlqqq really appreciate your consideration for this PR. Take your time with it. There are definitely tons of details to iron out before it is ready.
On a side note regarding the multi-user features: for my use case, I do not see us requiring or using such a feature, so I hope it won't add too many changes or other limitations, since for us it is primarily an AI assistant tool. This is just my personal opinion, and I don't mean to speak for other users, who I am sure would find such a thing useful.
@michaelchia Hey Michael, I've reviewed as much as I can today. I'm leaving a partial code review here, so you can address my feedback as I review the rest.
Overall, the code looks excellent! I'm impressed by how much work you've put into this. Left some feedback for you below.
@michaelchia Hey Michael, I've responded to your comments above and resolved conversations as needed. I left some more feedback below for you. I'm about 70% done with this review, but still need more time to test & get feedback from other contributors.
I sincerely appreciate your patience in the meantime! 🤗
@michaelchia Hey Michael, this PR looks great after your recent revisions. I've checked in with other members on the team, who all agree that this would be wonderful to include in the next 2.x release. I've also tested this branch locally and verified that it works well. Thank you for your hard work and patience! 🤗
ℹ️ I've left one last point of feedback below suggesting a safer way to ensure context providers are only triggered if the user explicitly typed a command to run them.
ℹ️ After deleting `_examples.py` and addressing that last point of feedback, I will approve and merge this PR.
There was only one usability issue that I noticed: the `@` autocomplete options are still shown even when a slash command is used. However, context is only included when using `DefaultChatHandler`. I think it is fine for this issue to be addressed in a future PR, given that this feature is still early-stage and in development.
Again, thank you for the outstanding effort you've invested into Jupyter AI thus far!
@michaelchia Thank you! This will be included in a minor release tomorrow. 🎉
* context provider
* split base and base command context providers + replacing prompt
* comment
* only replace prompt if context variable in template
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Run mypy on CI, fix or ignore typing issues (jupyterlab#987)
* mypy
* black
* modify backtick logic
* allow for spaces in filepath
* refactor
* fixes
* fix test
* refactor autocomplete to remove hardcoded '/' and '@' prefix
* modify context prompt template
* docstrings + refactor
* add context providers to help
* remove _examples.py and remove @learned from defaults
* make find_commands unoverridable

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michał Krassowski <[email protected]>
Co-authored-by: David L. Qiu <[email protected]>
Co-authored-by: david qiu <[email protected]>
Description
Aims to solve the first task of #910.
Extendable framework for adding context to prompts. Allows users to define `BaseContextProvider`s that are responsible for taking in user chat input and retrieving all relevant information that should be injected into the LLM prompt as context. `BaseContextProvider` has an abstract method `make_context_prompt` to be implemented, which takes in the `HumanChatMessage` and returns a string that should be added to the context. `BaseChatHandler` has a new method `make_context_prompt` that loops over all the `BaseContextProvider`s and appends their context prompts together. `DefaultChatHandler` passes the context to the prompt template, which has a new optional `{context}` placeholder. The default provider prompt template has been modified to include this optional `{context}` placeholder.
`BaseCommandContextProvider` can be triggered via a command with the `@` prefix, with auto-completion suggestions in the chat input UI, similar to slash commands. The differences are that it does not need to be at the start of the input, it can appear multiple times in the input, and it may have a single argument that is part of the command, e.g. "tell me what are the differences in `@file:file1.py` and `@file:file2.py`". A `BaseCommandContextProvider` can optionally modify or remove the commands from the user prompt before passing it to the LLM in `DefaultChatHandler`. This is to clean up the prompt so it is more understandable to an LLM that might not know what the command means.

Two default `BaseCommandContextProvider`s were implemented:

- `FileContextProvider` allows you to add the contents of a file to the context by calling `@file:<filepath>`. The filepath uses the same base path config logic and supported extensions as `/learn`. It also allows for filepath auto-complete suggestions (see demo).
- `LearnedContextProvider` (not the final name; see the discussion points below) is triggered with `@learn` as a replacement for `/ask`. It calls the same retriever and adds the snippets to the context. It is more flexible than `/ask` in that it can be used with other context providers. However, it is not obvious that it needs to be added as a replacement for `/ask`. I mostly implemented it as a concrete example of a retriever-based context provider; I'll have no issue removing it if needed.

Demo
`FileContextProvider`: Screen.Recording.2024-09-12.at.11.24.54.PM.mov
`LearnedContextProvider`: Screen.Recording.2024-09-12.at.11.30.00.PM.mov
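The framework described above can be sketched with a simplified, self-contained example. The class and method names (`BaseContextProvider`, `make_context_prompt`, `_find_instances`) follow this description, but the bodies are toy stand-ins written for illustration, not the actual jupyter-ai implementation:

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class HumanChatMessage:
    """Toy stand-in for jupyter-ai's HumanChatMessage."""
    body: str


class BaseContextProvider(ABC):
    """Turns a chat message into a string of context for the LLM prompt."""

    @abstractmethod
    def make_context_prompt(self, message: HumanChatMessage) -> str:
        ...


class BaseCommandContextProvider(BaseContextProvider):
    """Provider triggered by '@<id>:<arg>' commands in the input."""

    id: str = ""

    def _find_instances(self, text: str) -> List[str]:
        # Unlike slash commands, '@' commands may appear anywhere in the
        # input and may occur multiple times.
        return re.findall(rf"@{self.id}:(\S+)", text)


class FakeFileContextProvider(BaseCommandContextProvider):
    """Toy version of FileContextProvider: echoes filenames instead of
    reading real files."""

    id = "file"

    def make_context_prompt(self, message: HumanChatMessage) -> str:
        files = self._find_instances(message.body)
        return "\n".join(f"Contents of {path}: ..." for path in files)


def make_context_prompt(providers: List[BaseContextProvider],
                        message: HumanChatMessage) -> str:
    # BaseChatHandler-style loop: append all providers' context prompts.
    parts = (p.make_context_prompt(message) for p in providers)
    return "\n\n".join(part for part in parts if part)


msg = HumanChatMessage("what are the differences in @file:file1.py and @file:file2.py")
print(make_context_prompt([FakeFileContextProvider()], msg))
```

The resulting string is what would be substituted into the `{context}` placeholder of the prompt template.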
Usage
For users who want to develop their own context provider:
For a context provider without a command, which is always triggered or is triggered through some other form of inference/config, subclass `BaseContextProvider` and implement `make_context_prompt`.

To include a command, subclass `BaseCommandContextProvider` and similarly implement `make_context_prompt`. Use the `_find_instances(text: str)` method to get all instances of the command, either to detect the presence of the command or to generate and concatenate the context for each instance. For commands with an argument, you can optionally implement a `get_arg_options(prefix: str)` method to return the list of auto-complete options for the argument.

Examples
I added some examples in 'jupyter_ai/context_providers/_examples.py'. These are to be removed before merging, and I don't think any of them should be added by default.

They are meant to illustrate:

Some ideas for context providers are:

- `@var:<variable name>` to add variables from an active notebook kernel into the context `human_msg`.
- `@df:<var name>` for adding pandas and pyspark dataframe schemas.
- `ErrorContextProvider` that checks if the error is contained in the `human_msg` and uses that to query an internal database of errors and solutions for internal packages and error messages.
- `SelectionContextProvider` that adds the selection to the context section instead.

Implementation summary
- Refactored the auto-completion logic in 'chat-inputs.tsx': replaced the `slashCommandOptions` logic with more general `autocompleteOptions` logic that also supports autocomplete for context providers or any other future autocompletion types. Replaced the `SlashCommandsInfoHandler` with an `AutocompleteOptionsHandler` to provide the autocomplete options. (`SlashCommandsInfoHandler` was not removed yet, as I do not know whether it is intended to be used elsewhere in the future.)
- Added a `jai_context_providers` dict to the extension's `self.settings` and as a param to `BaseChatHandler`, similar to `jai_chat_handlers`.
- Modified `DefaultChatHandler` with a step to make the context prompt and to replace/clean up the prompt as mentioned above.
- Modified the default prompt templates in jupyter-ai-magics to include the `{context}` placeholder.
Points of discussion
- Naming of `ContextProvider`. @3coins was suggesting `KnowledgeBase`, but I feel that "Context" is more generally applicable and conventional for this.
- Whether to remove `SlashCommandsInfoHandler` and related code, if it is not planned to be used elsewhere.
- Whether `LearnedContextProvider` should be added.
- Naming of `LearnedContextProvider`, if it is to be added.

Future enhancements
Let me know what you guys think and if there is anything you would like me to change.