Revamped Jupyter AI #1157

Open
wants to merge 1 commit into base: main

govinda18

Revamped Jupyter AI

Hi team,

We at D.E. Shaw have forked Jupyter AI and made a significant number of changes to greatly enhance the user experience. Our goal was to align Jupyter AI with the capabilities offered by leading AI-based IDEs such as Cursor and Copilot. This pull request includes the majority of the changes we implemented.

This proposal is intended to outline these changes and may require some deliberation. We anticipate that this PR will need to be broken down into several smaller PRs, with some design changes to ensure smooth integration. Additionally, we are planning further enhancements, including per notebook chat and inline code generation, details of which are included in this proposal.

The demo showcases the powerful capabilities of Jupyter AI by incorporating notebook context, leveraging the kernel, and enhancing keyboard accessibility (I did not use my mouse at all while creating this).

jai_demo.mp4

Point of contact: @govinda18 @mlucool

Philosophy

Unlike Jupyter AI, IDEs like Cursor and Copilot treat anything in the IDE as fair game for context. We believe this is the correct user experience for sharing context with the LLM. For Jupyter AI to be effective, it should always provide context automatically, while also allowing quick code iterations through insertions and replacements.

We also want to leverage the one major difference that sets notebooks and JupyterLab apart from any other IDE: a runtime kernel. This opens up possibilities where you can include the context of any variable declared in your notebook and further inspect methods and objects to understand their usage. Check out jupyter/enhancement-proposals#128 for our pre-proposal on standardizing this.

For autocompletion to be helpful, it needs to be aware of the context in previous cells to provide a Copilot-like experience. For both chat and inline completion, we automatically send the entire notebook as context, trimmed to fit the context window of the LLM. Check out the Current Limitations section for more details.
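
As a rough illustration of what "context from surrounding cells" means for inline completion, the sketch below splits a notebook into the text before and after the cursor. It is not the code in this PR; the function name `split_notebook_context` and the blank-line cell separator are assumptions made for the example.

```python
def split_notebook_context(
    cells: list[str], active_cell: int, cursor: int
) -> tuple[str, str]:
    """Return (prefix, suffix): all notebook code before and after the cursor.

    `cursor` is a character offset inside the active cell's source.
    """
    current = cells[active_cell]
    # Everything above the cursor: earlier cells plus the typed part of the active cell.
    prefix = "\n\n".join(cells[:active_cell] + [current[:cursor]])
    # Everything below the cursor: the rest of the active cell plus later cells.
    suffix = "\n\n".join([current[cursor:]] + cells[active_cell + 1:])
    return prefix, suffix


cells = [
    "import pandas as pd",
    "df = pd.read_csv('data.csv')\ndf.",
    "print(df.shape)",
]
# Cursor sits at the end of the second cell, right after "df."
prefix, suffix = split_notebook_context(cells, active_cell=1, cursor=len(cells[1]))
```

A completion model that accepts a prefix and a suffix can then be prompted with both halves, so a suggestion after `df.` can take the later `print(df.shape)` cell into account.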

Lastly, we felt that the chat interface feels a little distant and difficult to access. Most power users of JupyterLab would not use a mouse while working with a notebook, but it's currently not easy to use Jupyter AI without mouse intervention.

Overall, we want to bring the chat and the notebook close enough such that the LLM intelligently understands what the user is talking about without the need to manually decide what context needs to be added. This should further integrate into the natural workflow of the user by providing inline code generation and keyboard accessibility.

What has changed?

This proposal aims to make Jupyter AI much more powerful and accessible, bringing a host of new features designed to enhance user productivity and streamline their data science workflows.

Major Features

  • Inline Code Completion: The Jupyter AI inline completer is now context-aware, utilizing code from both preceding and succeeding cells to provide more accurate suggestions.

    Currently, changes are sent from the frontend (reference), but we believe that jupyter_ydoc should handle this more effectively. We encountered some issues related to jupyter-collaboration issue #202 and did not want to wait for it.

  • Notebook Aware Chat: The chat model is now aware of your active notebook, current cell, and text selection. You can directly ask it to perform actions like "Refactor this code" and it will refactor your notebook or active cell accordingly, saving you valuable time. This was also proposed in "Add all notebook to context", #1037.

    Again, while this is currently sent from the frontend, we believe that jupyter_ydoc should handle this more effectively. Refer to this prompt to understand more.

    Further, we have added the ability to modify prompts directly from the Jupyter AI settings, enabling quicker iterations. Users can also view the exact prompt sent to the LLM by using the View Prompt call-to-action (CTA) in the chat interface.

  • Keyboard Accessibility: Navigate Jupyter AI with ease using keyboard shortcuts. Use Ctrl + Shift + K to enter chat mode (this is in line with the eventual Ctrl + K for inline code generation) and Escape to exit (ref). You can easily add generated code to your notebook by navigating with Shift + Tab.

    For an enhanced user experience, we have:

    • Changed the order of the cell toolbar to Copy, Replace, Add before, Add after, as we anticipate most people would want to add the code after the active cell, so it's just a Shift + Tab away. (ref)
    • Made focus return to the notebook directly after any action, which would be the typical next step. (ref)
  • Variable Insertion: Make the Large Language Model (LLM) aware of your variables effortlessly by inserting them into the chat using the @ symbol. This helps the model better understand your context and provide more accurate suggestions by asking the kernel for runtime information. We felt this should work directly without using a prefix like @var:var_name, as it is one of the most powerful benefits of having a live kernel.

    This functionality is driven by a variable description registry that knows how to describe any object to an LLM. Additionally, we are working on drafting a proposal to IPython for a generic __llm__ method that can be used by any Python class to describe itself to an LLM. More details in Pre-proposal: standardize object representations for ai and a protocol to retrieve them jupyter/enhancement-proposals#128.

    This also enhances user experience by providing an autocomplete of the variables declared in the currently active notebook. (ref)

    The current implementation is also pluggable as documented here.
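
To make the variable description registry idea more concrete, here is a minimal sketch of how such a registry could look. It is not the implementation in this PR: `register_describer`, `describe`, and `expand_mentions` are hypothetical names, and `__llm__` is only the hook proposed in jupyter/enhancement-proposals#128, not an existing Python or IPython protocol.

```python
import re
from typing import Any, Callable

# Hypothetical registry: maps a type to a function that describes its instances.
_DESCRIBERS: dict[type, Callable[[Any], str]] = {}


def register_describer(tp: type):
    """Decorator that registers a describer function for `tp` and its subclasses."""
    def decorator(fn: Callable[[Any], str]) -> Callable[[Any], str]:
        _DESCRIBERS[tp] = fn
        return fn
    return decorator


@register_describer(list)
def _describe_list(value: list) -> str:
    preview = ", ".join(repr(v) for v in value[:3])
    return f"list of {len(value)} items, starting with: {preview}"


def describe(value: Any) -> str:
    # 1. Let the object describe itself through the proposed __llm__ hook.
    hook = getattr(type(value), "__llm__", None)
    if callable(hook):
        return hook(value)
    # 2. Walk the MRO so subclasses inherit a registered describer.
    for tp in type(value).__mro__:
        if tp in _DESCRIBERS:
            return _DESCRIBERS[tp](value)
    # 3. Fall back to the plain string form.
    return str(value)


_MENTION = re.compile(r"@([A-Za-z_]\w*)")


def expand_mentions(message: str, namespace: dict[str, Any]) -> str:
    """Append a description for every @name that exists in `namespace`."""
    names = [n for n in dict.fromkeys(_MENTION.findall(message)) if n in namespace]
    if not names:
        return message
    notes = [
        f"{n} ({type(namespace[n]).__name__}): {describe(namespace[n])}"
        for n in names
    ]
    return message + "\n\nVariables:\n" + "\n".join(notes)
```

In a real extension the `namespace` lookup would be a request to the kernel rather than a local dictionary, and mentions that do not match a kernel variable are simply left as typed.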

Dropped Features

We also suppressed some of the existing features. The reasons differ, but we feel each of these features needs a bit more thought before we enable it generally.

Note that these features still work but are not shown in the autocomplete (ref) or help message (ref).

  • Support for @file: We believe that adding an entire file to the context needs a bit more thought. Most users would want to add the current notebook to the context, which we now do by default. Adding other types of files should involve some pre-processing. For example, a Python file should be added significantly differently from a CSV file.
  • Support for /learn and /ask: There are some fundamental issues with the current implementation:
    • The embedding generation occurs in the main thread (last we checked, not sure if it has changed now) thereby blocking the kernel. We often found users shooting themselves in the foot while trying to make the LLM learn a big folder.
    • Embedding generation of notebooks would not work great as semantic search on code fails more often than not. A better approach may be to use an LLM based text description of code in case of notebooks.
    • /learn and /ask don't scale to a large number of files, as accuracy decreases greatly as you add more embeddings. This is a problem with semantic search in general. Things like hybrid search, reranking and metadata filtering should be researched and added. We found for our internal data that it was too easy to add too much for it to be accurate (adding a smaller amount of data was useful but not very practical).
  • /fix should be replaced with an eventual Fix with AI button that we plan to have whenever an error occurs in the notebook.
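
On the blocking issue mentioned for /learn: one common way to keep an asyncio event loop responsive is to push the slow call onto a worker thread. The sketch below shows that general pattern with the standard library only. It is not how Jupyter AI implements /learn, and `embed_documents` is a made-up stand-in for a real embedding call.

```python
import asyncio
import time


def embed_documents(texts: list[str]) -> list[list[float]]:
    """Hypothetical stand-in for a slow, blocking embedding call."""
    time.sleep(0.05)
    return [[float(len(t))] for t in texts]


async def learn(texts: list[str]) -> list[list[float]]:
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop can keep handling other requests in the meantime.
    return await asyncio.to_thread(embed_documents, texts)


async def main() -> tuple[list[list[float]], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run while embedding is in progress.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(heartbeat())
    vectors = await learn(["a", "bb"])
    task.cancel()
    return vectors, ticks
```

Running `asyncio.run(main())` shows the heartbeat ticking several times while the embedding call is in flight. Note that threads only help when the blocking call releases the GIL (network or native code); pure-Python CPU work would need a process pool instead.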

Enhancements

  • Support for resetting the Jupyter AI config from the UI (ref)
  • Support for setting the default completion model from the config (ref)
  • Changed the delete icon on human messages to a menu, similar to AI messages, for scalability, with support for copying the prompt sent by the user (ref)
  • Updated the default prompts for chat input (ref)

Current Limitations

  • To avoid exceeding the context window when including the entire notebook, we have introduced an abstraction called process_notebook_for_context. Individual providers can implement a method like the one below to optimize the context window:

      # get_max_token, get_token_count_by_model, MAX_NOTEBOOK_TOKENS_PCT and
      # PREFIX_SUFFIX_RATIO are assumed to be defined elsewhere in the package.
      def process_notebook_for_context(model_id: str, code_cells: list[str], active_cell: int | None) -> str:
          """
          Processes the notebook to prepare context-aware code for LLM-based completion.

          This method respects the token limit for the notebook context by strategically selecting
          code from surrounding cells (both prefix and suffix) to ensure the current active cell
          has the most relevant context.

          Steps:
          1. The current active cell is taken entirely as the initial context.
          2. A share of the budget (PREFIX_SUFFIX_RATIO) is reserved for suffix cells, which are
             added until that share is exhausted.
          3. Remaining tokens are allocated to prefix cells and added similarly.
          4. Any remaining tokens are used to extend the suffix further if the prefix is fully utilized.
          5. Comments are added to indicate the number of cells hidden above and below the context.

          Parameters:
          - model_id (str): The identifier for the LLM model being used.
          - code_cells (list[str]): The list of code cells in the notebook.
          - active_cell (int | None): The index of the currently active cell in the notebook.
                                      Defaults to 0 if not provided.

          Returns:
          - str: The context-aware code string, including the active cell, prefix, and suffix
                 code cells, along with comments indicating hidden cells.
          """
          active_cell = active_cell or 0
          code_context = code_cells[active_cell]
          prefix_idx = active_cell - 1
          suffix_idx = active_cell + 1

          total_tokens = get_max_token(model_id) * MAX_NOTEBOOK_TOKENS_PCT

          model_for_counting = model_id
          rem_tokens = total_tokens - get_token_count_by_model(
              code_context, model_for_counting
          )

          # Reserve part of the remaining budget for the cells below the active cell.
          max_suffix_tokens = max(
              0, min(int(total_tokens * PREFIX_SUFFIX_RATIO), int(rem_tokens))
          )
          rem_tokens -= max_suffix_tokens

          suffix_code: list[str] = []
          while suffix_idx < len(code_cells):
              token_to_be_used = get_token_count_by_model(
                  code_cells[suffix_idx], model_for_counting
              )
              if max_suffix_tokens - token_to_be_used < 0:
                  break

              max_suffix_tokens -= token_to_be_used
              suffix_code.append(code_cells[suffix_idx])
              suffix_idx += 1

          # Return the unused part of the suffix reservation to the shared budget.
          rem_tokens += max_suffix_tokens

          # Walk upwards from the active cell; index 0 is included.
          prefix_code: list[str] = []
          while prefix_idx >= 0:
              token_to_be_used = get_token_count_by_model(
                  code_cells[prefix_idx], model_for_counting
              )
              if rem_tokens - token_to_be_used < 0:
                  break

              rem_tokens -= token_to_be_used
              prefix_code.append(code_cells[prefix_idx])
              prefix_idx -= 1

          # If any tokens are remaining, add them to the suffix
          while rem_tokens > 0 and suffix_idx < len(code_cells):
              token_to_be_used = get_token_count_by_model(
                  code_cells[suffix_idx], model_for_counting
              )
              if rem_tokens - token_to_be_used < 0:
                  break

              rem_tokens -= token_to_be_used
              suffix_code.append(code_cells[suffix_idx])
              suffix_idx += 1

          # prefix_idx is -1 only when every cell above the active cell was included.
          if prefix_idx != -1:
              prefix_code.append(
                  f"# Hiding {prefix_idx + 1} more cells above the context provided."
              )

          if suffix_idx != len(code_cells):
              suffix_code.append(
                  f"# Hiding {len(code_cells) - suffix_idx} more cells below the context provided."
              )

          # prefix_code was collected bottom-up, so reverse it to restore notebook order.
          return "\n\n".join(prefix_code[::-1] + [code_context] + suffix_code)
    
  • This PR has only been tested with models that stream, but it should be easy to extend support to those that do not.

  • Variable context is currently limited to basic data types and pandas objects. We default to __str__ for any other object and aim to expand support further.
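
The process_notebook_for_context example above relies on helpers and constants that are not shown in this description. For anyone who wants to experiment with it, the stand-ins below are one possible way to fill them in. All values and the whitespace-based token count are assumptions for illustration, not what the fork uses.

```python
# Hypothetical stand-ins: the real helpers and constant values are not part of
# this PR description.
MAX_NOTEBOOK_TOKENS_PCT = 0.5  # assumed share of the context window given to notebook code
PREFIX_SUFFIX_RATIO = 0.25     # assumed share of that budget reserved for cells below

# Made-up model id and context size, small enough to see truncation on toy notebooks.
_CONTEXT_WINDOWS = {"toy-model": 64}


def get_max_token(model_id: str) -> int:
    """Return the context window size for a model, with an arbitrary default."""
    return _CONTEXT_WINDOWS.get(model_id, 4096)


def get_token_count_by_model(text: str, model_id: str) -> int:
    """Crude approximation: one token per whitespace-separated word."""
    return len(text.split())


# Budget for "toy-model": 64 * 0.5 = 32 tokens in total, 8 of them reserved for the suffix.
total_budget = get_max_token("toy-model") * MAX_NOTEBOOK_TOKENS_PCT
suffix_budget = int(total_budget * PREFIX_SUFFIX_RATIO)
```

With these defined in the same module, the function can be called on small lists of cell sources to see which cells are kept and which are replaced by the "Hiding N more cells" comments. A real provider would swap in the model's own tokenizer for the word count.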

Other ideas we are working on

  • Inline Code Generation: Inspired by Cursor, we are developing an inline code generation feature in Jupyter AI. Although still in development, here is a GIF demonstrating our vision for this feature (note that some major UI changes are still pending):
    jai-2

  • Per Notebook Chat: Currently, the chat is shared across notebooks, which creates a poor user experience as chat history becomes irrelevant when switching notebooks. We are working on adding support for maintaining a per-notebook chat to address this issue.

  • Fix with AI Button: Each error in JupyterLab will have a "Fix with AI" button, which essentially implements an inline version of the /fix command, making it easier for users to resolve errors directly within their workflow.

@mlucool

mlucool commented Dec 18, 2024

One more thing: because we call into the kernel, the subshell JEP would benefit this implementation as well. Otherwise, things like autocomplete and getting LLM descriptions can be blocked.

@dlqqq
Member

dlqqq commented Dec 18, 2024

Thank you @govinda18 and @mlucool for contributing to the future of this project by proposing these big ideas! I really appreciate the immense amount of effort put into the design & implementation of the changes. ❤️

These are all very sound ideas. In fact, I believe we have existing issues for each of them. Here is a quick summary for others in this thread:

  • Improve inline completions w/ more context
  • Automatic awareness of open notebooks
  • Keyboard accessibility
  • Local variable inclusion in chat input
  • (TBD) Improving @file to handle large files
  • (TBD) Improve /learn and /ask
  • (TBD) Dropping /fix while improving the UX for fixing bugs
  • (TBD) Add a per-notebook chat

Currently, our team has shifted focus away from Jupyter AI v2 to focus on building the next major release, Jupyter AI v3, which is planned for mid-Feb 2025. We are developing this on the v3-dev branch in the same repo. Because v2 & v3 already deviate significantly, it would likely be impractical to port any large changes from v2 => v3. Therefore these proposals would likely be implemented as v3-exclusive features.

Given this, I'll share some context on v3 and provide some suggestions on how we can move forward on this 🤗.

Jupyter AI v3

In v3, all of the logic for managing chat state in the backend & rendering chats in the frontend has been migrated to a project called jupyter-chat. @brichet from QuantStack (a contracting firm) has led development on this. This migration helps to separate concerns and allow multiple extensions to hook into the same chat.

jupyter-chat mostly borrows the frontend of jupyter-ai, but deviates significantly in the backend implementation. jupyter-chat uses a Yjs shared document to model the chat state. jupyter-collaboration, a dependency of jupyter-chat, automatically syncs this shared chat across all clients and persists the chats as *.chat files. This is the same package which provides RTC capabilities in JupyterLab.

  • Note: The v3-dev branch already supports multiple chats in the same session. I will start working with others to explore if these files can be "tied" to a notebook to provide the "per-notebook chat" idea proposed here.

Suggestions on moving forward

Out of all the fantastic ideas proposed here, I think local variable inclusion (e.g. @foobar) is the most valuable and least ambiguous. Having this would be a killer feature for Jupyter AI, without question. jupyter-chat also already provides an input suggestion API that reads Jupyter AI's slash commands and shows them when / is pressed. Question: Does local variable inclusion sound like a reasonable first step?

v3 is still early in development, so jupyter-chat will require many changes if we want to see these ideas implemented in v3. To plan this effort, it would be helpful to know how much time is available from each of our teams. Question: How much commitment can you dedicate towards developing these capabilities?

  • On our side, I am the only person working full-time on Jupyter AI, while @brichet is the only person working full-time on Jupyter Chat.

Finally, it may be helpful to establish a dedicated communication channel for technical discussion as needed. Question: Do you all want a comms channel? AWS has a Slack workspace that allows for external connections; I can explore this if interested.

@krassowski
Member

Finally, it may be helpful to establish a dedicated communication channel for technical discussion as needed.

I don't know if AWS team had a chance to explore the Jupyter Zulip channel but a lot of dev chatter around Jupyter nowadays happens over there (and it is still public).

@michaelchia
Collaborator

@dlqqq please include me if there is a comms channel for this. I've also been running a hacky patched version of jupyter-ai that has a very similar automatic context feature for the active file and info on variables via the active kernel. I would like to see how this will be enabled in v3. I am primarily interested in how extensible and configurable it would be for developers.

For example, in my version, I only extract dataframe schemas from variables and would not include output of cells in the context to mitigate risk of sending sensitive information to the models. I would hope that I would be able to configure it to do something like that or at minimum have the possibility of monkey patching it to do so.
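
A schema-only description along these lines could look roughly like the sketch below. It is an illustration, not the patched code mentioned above: `describe_schema` is a hypothetical name, and the function is duck-typed on `.shape` and `.dtypes.items()`, which pandas DataFrames provide.

```python
from typing import Any


def describe_schema(name: str, obj: Any) -> str:
    """Describe a DataFrame-like object by shape, column names and dtypes only.

    No cell values are ever included. Objects that do not look like a table are
    reduced to their type name.
    """
    dtypes = getattr(obj, "dtypes", None)
    shape = getattr(obj, "shape", None)
    if dtypes is None or shape is None or not hasattr(dtypes, "items"):
        # Not table-like: expose the type name, never the value itself.
        return f"{name}: {type(obj).__name__}"
    columns = ", ".join(f"{col}: {dtype}" for col, dtype in dtypes.items())
    return f"{name}: table with {shape[0]} rows; columns: {columns}"
```

Making a function like this the configurable default describer would let deployments with sensitive data keep values out of the prompt while still giving the model column names and types to work with.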

@brichet
Contributor

brichet commented Dec 19, 2024

Awareness of the current Notebook and interaction with the associated kernel would be a really nice feature.

FYI I opened jupyterlab/jupyter-chat#128 to track this in jupyter-chat.

@dlqqq
Member

dlqqq commented Dec 19, 2024

@michaelchia Absolutely! As @krassowski mentioned, Zulip could be the tool for this. That way, it's accessible to people who aren't able to use Slack. I'll explore this in parallel now, since it'd be helpful to have a jupyter-ai channel there regardless of what we decide to use.

@Zsailer
Member

Zsailer commented Dec 20, 2024

This is great work!

I wonder if y'all saw my demo in last month's community call: https://youtu.be/ildFScV6mZQ?si=7Wa7JFkZXsUnHK0K&t=1348

There is a lot of similar UX I demoed that I'd love to coordinate/sort out with you all.

I've written some extensible plugins similar to yours:

  1. a cell diff UX and command to show diff
  2. a cell (input) footer bar
  3. a pending toolbar button (surprisingly annoying to build)
  4. commands to insert code in cell
  5. commands to request feedback
  6. integration with langgraph to make decisions depending on cell context.
  7. AI API driven by JupyterLab commands, so that LLMs can "speak" to the UI and tell it what to do more easily.

along with some other useful things like getting feedback from users.

I've been slowly working on shipping the building blocks as individual plugins:

I broke all of these things into separate repos (for now) so that they can be as extensible as possible and (possibly) roll into Jupyter AI more easily. Maybe we can collaborate on these efforts?

@mlucool

mlucool commented Dec 20, 2024

Sounds great @Zsailer, we'll have a version of what we were thinking ready too. Some of the ideas you propose seem pretty great. Noting some differences from your demo that I think are important:

  1. Conversational UI around cells that are overlaid. That is, we don't use the cell as the input box (in part so we can have multi-turn conversations and iteration, rather than one-shot)
  2. Deeper integration into jupyter AI to use / commands. The idea is they are very similar in features. One is for longer chats (jupyter-ai today) and the other is for iterating on a single cell in focus (maybe multi-cell in the future).
  3. Diffs are in the cell and by default what users see (similar to cursor). This is to make any changes clear quickly. Our hope is to be able to layer them as the user continues to add more follow ups. This makes it much easier to accept/reject and undo after testing, and to add a few small changes to try generated code while still being able to fully undo what happened.

I'm working with @dlqqq to get a Slack channel created (it's our preferred real-time choice also), so we can invite you (and whomever else) to that room if we are able to get this set up.

@Zsailer
Member

Zsailer commented Dec 20, 2024

Conversational UI around cells that are overlaid. That is, we don't use the cell as the input box (in part so we can have multi-turn conversations and iteration, rather than one-shot)

I went back and forth on this, and dogfooded this quite a bit. I have some loose opinions here after using this stuff a lot, but could see pros/cons for both. You're totally right: you give up the multi-turn conversation if you don't overlay, but man, typing in the cell is just a buttery smooth, simple experience 😅. There's something about just writing directly in a cell that I couldn't get over.

Deeper integration into jupyter AI to use / commands. The idea is they are very similar in features. One is for longer chats (jupyter-ai today) and the other is for iterating on a single cell in focus (maybe multi-cell in the future).

Totally. I was working on this as well. I've been focused on UX primarily. Integrating with Jupyter AI commands and tools (coming in #991) is something we absolutely need—I just didn't get there yet.

Diffs are in the cell and by default what users see (similar to cursor)

I also did this originally, but didn't like the experience. I felt like I was baby-sitting the AI. I didn't want to have to copy-edit the LLM's response by default every time. That's what you're imposing on the user by automatically showing an inline (possibly complex) diff. The user has to manually accept everything. Instead, I'd rather assume my AI collaborator is decent enough at its job, so I'd prefer to try its work and only peek at a diff when I'm skeptical. Perhaps I'm too trusting of the AI ;) but my long term thinking here is that the AI is going to become a pretty good collaborator going forward. I don't want the AI to create "homework" for me (i.e. reviewing its work every time) when using it. I recognize, however, I might be in the minority here. haha

Our hope is to be able to layer them as the user continues to add more follow ups

This is an interesting outcome, though, that might overturn my opinion in the last paragraph :)

accept/reject and undo after testing

One thing I should note... I added an undo + redo button to the cell toolbar. When dogfooding, undo/redo using the cell toolbar felt quite natural: the user assumes the AI is pretty decent, only shows the diff if they're skeptical, and can flip to previous states using the undo / redo buttons easily. This keeps the "purity" of the code cell in place, without layering on diffs and causing confusion around what the "true" state of the code cell is. The AI button, cell diffs, and undo/redo actions are ancillary activities to the main code cell.

I'm working with @dlqqq to get a Slack channel created (it's our preferred real-time choice also), so we can invite you (and whomever else) to that room if we are able to get this set up.

Yes, please add me :) I'm thrilled to see this stuff land. Would love to talk through these ideas more concretely; I've been

@dlqqq
Member

dlqqq commented Dec 20, 2024

Great discussion here. @Zsailer I would love to sync later about these features you've built and how to get them in Jupyter AI in the future. I haven't been kept in-the-loop about your progress. Feel free to ping me on Slack if you have availability to chat in the next 2 weeks; otherwise I'll follow up on Jan 2 after folks have their holidays.

Also, I reached out to the Jupyter Media Strategy WG, and @andrii-i helped create a #jupyter-ai Zulip channel. Others, please feel free to have discussion there too!

@dlqqq
Member

dlqqq commented Dec 20, 2024

@mlucool and I are still working on the Slack channel. Will keep folks posted.

@echarles
Member

Posting here work we are doing in https://github.com/datalayer/jupyter-ai-agent

What is the best place to discuss: this repo, or the Zulip channels?

@echarles
Member

AWS has a Slack workspace that allows for external connections; I can explore this if interested.

I missed that sentence. TBH I would prefer having those discussions on GitHub issues, instead of spreading them across GitHub/Zulip/Slack. That way we can be sure not to miss anything.

@echarles
Member

Perhaps I'm too trusting of the AI ;) but my long term thinking here is that the AI is going to become a pretty good collaborator going forward.

@Zsailer As good as AI can be, users need to be clearly informed of the origin of the content. We are even considering giving the user the opportunity to run the generated code in a kernel clone before applying it to their current kernel. These are lessons learned from the ISO 42001 certification we are pursuing to ensure responsible AI usage and development.

@Zsailer
Member

Zsailer commented Jan 4, 2025

As good as AI can be, users need to clearly be informed of the origin of the content.

Oh I agree! My plugin stores the history of AI changes in the cell metadata for that reason. It also shows a diff (collapsed by default) so you can easily tell what the LLM changed if you want to.

My point was more of a UX point. It's annoying to have to constantly "approve" a change when your LLM is good at its task. I begin to feel like a micromanager and the AI adds work to my plate rather than remove it.

So instead I preferred a UX where I can peek back at what the AI did when I feel it's necessary... instead of always requiring approval.
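
For readers following along, the "history in cell metadata plus on-demand diff" idea can be sketched with the standard library alone. This is an illustration under assumptions, not the plugin's code: the metadata key `ai_history` and the three function names are made up, and the cell is modelled as a plain dictionary with a string `source`.

```python
import difflib

HISTORY_KEY = "ai_history"  # hypothetical metadata key, not a Jupyter standard


def apply_ai_edit(cell: dict, new_source: str) -> None:
    """Replace the cell source and remember the previous version in metadata."""
    history = cell.setdefault("metadata", {}).setdefault(HISTORY_KEY, [])
    history.append(cell["source"])
    cell["source"] = new_source


def undo_ai_edit(cell: dict) -> bool:
    """Restore the most recent previous version; return False if there is none."""
    history = cell.get("metadata", {}).get(HISTORY_KEY, [])
    if not history:
        return False
    cell["source"] = history.pop()
    return True


def last_change_diff(cell: dict) -> str:
    """Unified diff between the previous version and the current source."""
    history = cell.get("metadata", {}).get(HISTORY_KEY, [])
    if not history:
        return ""
    return "\n".join(
        difflib.unified_diff(
            history[-1].splitlines(),
            cell["source"].splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )
```

The cell source always holds the "true" state, while the diff is computed only when the user asks for it, which matches the collapsed-by-default behaviour described above.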

@Zsailer
Member

Zsailer commented Jan 7, 2025

Just saw that @mbektas published another AI plugin here 🚀

Referencing here so we can start to work together and de-dupe some of this work. 😃

@Zsailer
Member

Zsailer commented Jan 7, 2025

I would propose that we get all of the folks here together on a call and develop a plan to collaborate and de-duplicate some of this work. I think there is so much good stuff here, it would be awesome to weave it all together.

Maybe we can target a JupyterLab call?

We also have a Zulip going here: https://jupyter.zulipchat.com/#narrow/channel/475130-jupyter-ai/topic/Welcome

8 participants