Design proposal: Chat Completions API (rev. 0) #143

Closed
dlqqq opened this issue Jan 3, 2025 · 4 comments

dlqqq commented Jan 3, 2025

Description

This issue proposes a design for a new Chat Completions API. This API will allow consumer extensions to provide completions for the user's current input from the UI. In this context, a consumer extension is any frontend + server extension that intends to provide completions for substrings in the chat input.

Motivation

Suppose a user types / in the chat with Jupyter AI installed. Today, Jupyter Chat responds by showing a menu of chat completions:

[Screenshot: the completions menu shown in Jupyter Chat after typing /]

The opening of this completions menu is triggered simply by typing /. However, because the current API only allows a single "trigger character", this doesn't work when @ is typed, meaning that @file commands cannot be auto-completed.

This design aims to:

  1. Extend the existing chat completions capability to allow for completions to be triggered on multiple triggering patterns.
  2. Allow triggering patterns to be more complex than the existence of a single character.

To help explain the proposed design, this document will start from the perspective of a consumer extension, then work backwards towards the necessary changes in Jupyter Chat.

Step 1: Defining & providing a ChatCompleter

To register completions for partial inputs, a consumer extension must provide a set of chat completers. A chat completer is a Python class which provides:

  • id (property): Defines a unique ID for this chat completer. We will see why this is useful later.

  • regex (property): Defines a regex which matches any incomplete input.

    • Each regex should end with $ to ensure this regex only matches partial inputs just typed by the user. Without $, the completer may generate completions for commands which were already typed.
  • get_completions(match: str) -> List[ChatCompletion] (method): Accepts a substring matched by the completer's regex and returns a list of potential completions for that input. This list may be empty.

    • The interface of ChatCompletion will be defined later; for now, we can think of this method as just returning a list of strings that are potential completions to the user's input.
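
To make the role of the trailing $ concrete, here is a minimal sketch that compares two patterns differing only in that anchor. The pattern names and the helper matched_text are hypothetical and exist only for this illustration; Python's re module is used here, and a frontend regex engine may differ in small details.

```python
import re

# Hypothetical patterns for illustration; only the trailing "$" differs.
WITH_ANCHOR = re.compile(r"^/\w*$")
WITHOUT_ANCHOR = re.compile(r"^/\w*")

def matched_text(pattern, user_input):
    """Return the substring matched by `pattern`, or None if nothing matches."""
    m = pattern.search(user_input)
    return m.group(0) if m else None

# While the user is still typing the command, both patterns match.
still_typing = matched_text(WITH_ANCHOR, "/le")

# After the command has been typed and followed by more text, only the
# unanchored pattern still matches, so it would keep offering completions
# for a command that is already complete.
anchored_after = matched_text(WITH_ANCHOR, "/learn some docs")
unanchored_after = matched_text(WITHOUT_ANCHOR, "/learn some docs")
```

With the anchor, the completer only fires while the cursor is still at the end of the partial command, which is the behavior described above.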

Jupyter Chat will provide an AbstractChatCompleter class that defines the structure of the chat completer class, shown below.

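from abc import ABC, abstractmethod
from typing import List

class AbstractChatCompleter(ABC):
    @property
    @abstractmethod
    def id(self) -> str:
        """Unique ID for this chat completer."""
        raise NotImplementedError()

    @property
    @abstractmethod
    def regex(self) -> str:
        """Regex that matches incomplete input; should end with `$`."""
        raise NotImplementedError()

    # "ChatCompletion" is a forward reference; its interface is defined later.
    @abstractmethod
    def get_completions(self, match: str) -> List["ChatCompletion"]:
        """Return potential completions for the matched substring."""
        raise NotImplementedError()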

To define a chat completer, a consumer extension should implement the AbstractChatCompleter class. Here is an example of how Jupyter AI may implement a chat completer to provide completions for its slash commands:

class SlashCommandCompleter(AbstractChatCompleter):
    @property
    def id(self):
        return "jai-slash-commands"
    
    @property
    def regex(self):
        # matches when:
        # - any partial slash command appears at start of input
        # - the partial slash command is immediately followed by end of input
        #
        # Examples:
        # - "/" => matched
        # - "/le" => matched
        # - "/learn" => matched
        # - "/learn " (note the space) => not matched
        # - "what does /help do?" => not matched
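        # Return only the pattern source as a raw string. The /.../ delimiters
        # are JavaScript literal syntax and are not part of the pattern.
        return r"^/\w*$"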
    
    def get_completions(self, match: str) -> List[ChatCompletion]:
        # should behave like:
        # "/" => ["/ask", "/help", "/learn", ...]
        # "/l" => ["/learn"]
        # "/h" => ["/help"]
        # "/zxcv" => []
        ...
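
The body of get_completions() is left open above. One possible implementation is plain prefix filtering, sketched below as a standalone function that returns strings rather than ChatCompletion objects. The command list and the function name are assumptions for illustration; the real set of slash commands would come from Jupyter AI.

```python
from typing import List

# Hypothetical command list, used only for this sketch.
SLASH_COMMANDS = ["/ask", "/help", "/learn"]

def get_slash_completions(match: str) -> List[str]:
    """Return every known slash command that starts with the matched text."""
    return [cmd for cmd in SLASH_COMMANDS if cmd.startswith(match)]
```

Under these assumptions, "/l" yields only "/learn" and "/zxcv" yields an empty list, in line with the behavior described in the comments above.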

Finally, for a consumer extension to provide these chat completers to Jupyter Chat, the consumer extension must declare each class as an entry point in a fixed entry point group. When Jupyter Chat reads from this entry point group on init, Jupyter Chat can gather all chat completer classes from every consumer extension.

Entry points are defined in the PyPA entry points specification. Jupyter AI already uses entry points to allow other extensions to add extra chat commands. We will not go into detail here, as Jupyter AI already serves as a reference implementation.

Step 2: Define the chat completion REST API

From the example SlashCommandCompleter implementation above, we can piece together what Jupyter Chat's frontend should do:

  1. On init, fetch the chat completer IDs & regexes from the backend.
  2. When any completer's regex is matched by the user's input:
    • Fetch a list of all valid completions from every chat completer whose regex is matched by the user's input, from the backend.
    • Show the list of all completions in the UI.
    • When a completion is accepted, replace the substring of the input matched by the completer's regex with the completion.

The frontend implementation should debounce how frequently it checks the input for step 2, since it will be expensive to do on every typed character.

To make this possible, we need to define a new REST API for chat completion.

Completions REST API

  • GET /chat/completers: Returns a ChatCompletersResponse object, which describes all of the chat completers provided to Jupyter Chat by consumer extensions.

  • POST /chat/completion_matches: Returns a ChatCompletionsResponse object given a ChatCompletionsRequest object in the request body. This fetches the list of completions for each matched substring in the request. The frontend should call this when the user's chat input matches any of the regexes returned by GET /chat/completers.

Request & response types

type ChatCompletersResponse = {
    completers: ChatCompleter[];
}

type ChatCompleter = {
    id: string;
    regex: string;
}

type ChatCompletionsRequest = {
    matches: ChatCompleterMatch[];
}

type ChatCompleterMatch = {
    completerId: string;
    match: string;
}

type ChatCompletionsResponse = {
    completions: ChatCompletion[]
}

type ChatCompletion = {
    // e.g. "/ask"
    value: string;
    
    // if set, use this as the label. otherwise use `value`.
    label?: string;
    
    // if set, show this as a subtitle.
    description?: string;
    
    // identifies which icon should be used.
    // not described here, so consider this field reserved for now.
    iconType?: string;
}
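
On the backend, handling POST /chat/completion_matches could come down to dispatching each match to the completer with the matching ID and concatenating the results. The sketch below assumes that matches carries a list of ChatCompleterMatch objects and works on plain dictionaries. PrefixCompleter and handle_completions_request are hypothetical names used only for this illustration, and skipping unknown IDs is just one possible policy.

```python
from typing import Any, Dict, List

class PrefixCompleter:
    """Hypothetical stand-in for a concrete chat completer."""

    def __init__(self, completer_id: str, values: List[str]):
        self.id = completer_id
        self._values = values

    def get_completions(self, match: str) -> List[Dict[str, Any]]:
        # Each completion is a dict shaped like the ChatCompletion type.
        return [{"value": v} for v in self._values if v.startswith(match)]

def handle_completions_request(
    body: Dict[str, Any], completers_by_id: Dict[str, Any]
) -> Dict[str, Any]:
    """Build a ChatCompletionsResponse-shaped dict from a request body."""
    completions: List[Dict[str, Any]] = []
    for item in body.get("matches", []):
        completer = completers_by_id.get(item["completerId"])
        if completer is None:
            continue  # unknown completer ID: skipped in this sketch
        completions.extend(completer.get_completions(item["match"]))
    return {"completions": completions}

completers = {
    "jai-slash-commands": PrefixCompleter(
        "jai-slash-commands", ["/ask", "/help", "/learn"]
    )
}
response = handle_completions_request(
    {"matches": [{"completerId": "jai-slash-commands", "match": "/h"}]},
    completers,
)
```

Because results are keyed by completer ID rather than by regex, two completers that share a regex can each contribute their own completions to the same response.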

Example request flow

In this section, we will explore the REST API calls made in an example setting. This assumes that this design has been implemented exactly as stated, and that SlashCommandCompleter has been provided by another consumer extension.

When a user opens JupyterLab, the frontend immediately calls GET /chat/completers to fetch the list of completers & their regexes. With just one completer provided, the ChatCompletersResponse object is:

{
    "completers": [
        { "id": "jai-slash-commands", "regex": "^/\\w*$" }
    ]
}

Then, the user types /h. This matches the regex of SlashCommandCompleter, so the frontend calls POST /chat/completion_matches with a ChatCompletionsRequest object:

{
    "matches": [
        { "completerId": "jai-slash-commands", "match": "/h" }
    ]
}

The backend receives this request and responds with a ChatCompletionsResponse object. Here, we assume that /help is the only valid completion. Note that value ends with a trailing space, so accepting the completion also inserts a space after the command.

{
    "completions": [
        {
            "value": "/help ",
            "label": "/help",
            "description": "Display a help message (Jupyter AI).",
            "iconType": "book"
        }
    ]
}

The user's menu now shows a single completion for /h. When accepted, it replaces /h with /help followed by a space.

Conclusion

Together, the entry points API (step 1) and the REST API (step 2) form the proposed Chat Completion API.

Benefits & applications

  • Completers are uniquely identified by their id, so two completers can use the same regex but yield two different sets of completions.
    • Application: Another extension could use the same / command regex to provide completions for its own custom / commands.
    • Application: Typing @ can trigger completions from multiple completers; one may provide usernames of other users in the chat, and another may provide the context commands available in Jupyter AI (e.g. @file).
  • A completion doesn't need to share a prefix with the substring that triggered completions.
    • Application: Define a completer that matches $ and returns the completion \\$. Pressing "Enter" to accept the completion allows a user to easily type a literal dollar sign instead of opening math mode. If typing math was the user's intention, typing any character other than "Enter" hides the \\$ completion and allows math to be written.
  • Regex allows the triggering of completions to be strictly controlled. This means that "complete-able" suffixes don't need some unique identifier like / or @.
    • Application: Define a completer that matches ./ following whitespace and returns filenames for the current directory. For example, this could trigger the completions ./README.md, ./pyproject.toml, etc.

    • Application: Define a completer that matches : following whitespace and returns a list of emojis.
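
The last two applications can be sketched as regexes. The patterns below are assumptions made for illustration and are not part of the proposal: they also accept the trigger at the very start of the input, and they capture the trigger text in a group so that the preceding whitespace is not replaced when a completion is accepted.

```python
import re

# Hypothetical trigger patterns: "./" or ":" at the start of the input or
# right after whitespace, followed by the partial text up to end of input.
FILE_TRIGGER = re.compile(r"(?:^|\s)(\./\S*)$")
EMOJI_TRIGGER = re.compile(r"(?:^|\s)(:\w*)$")

def trigger_text(pattern, user_input):
    """Return the partial text to complete, or None if nothing triggers."""
    m = pattern.search(user_input)
    return m.group(1) if m else None

file_partial = trigger_text(FILE_TRIGGER, "summarize ./READ")
emoji_partial = trigger_text(EMOJI_TRIGGER, "nice :smi")

# A colon inside a word, such as a time of day, does not trigger anything.
no_trigger = trigger_text(EMOJI_TRIGGER, "meet at 10:30")
```

Requiring whitespace before the trigger is what keeps ordinary text such as 10:30 from opening the emoji menu.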

Risks considered

  • This design proposes that the completer classes are defined in the backend. This may be a concern as some data & state is more easily accessed from the frontend.

    • I can change this such that completer classes are defined in the frontend. The get_completions() method can be made async such that some completers can make a network call to use backend APIs, but others can use frontend APIs directly.
    • One issue with defining completers in the frontend is that I'm not sure if it will allow multiple (>1) extensions to provide completers. As far as I know, at most one extension can provide a Lumino token.
  • I'm not sure if the current design will be sufficient for the @-mentioning of kernel local variables. This is a proposal for Jupyter AI v3.

If a major revision of this design is needed, I will close this issue, revise the design, and open a new issue with a bumped revision number.

dlqqq added the enhancement label on Jan 3, 2025
dlqqq self-assigned this on Jan 3, 2025
dlqqq commented Jan 3, 2025

cc @michaelchia

dlqqq commented Jan 3, 2025

@mlucool shared some feedback about this with me privately. To summarize Marc's feedback: the completers should live in the frontend, and get_completions() should be made async.

This proposed revision brings several benefits:

  • Completers can be optimized to minimize network calls. Some completions can be cached as the list of completions is immutable at runtime; this is true for / commands in Jupyter AI. These completers can fetch the list of completions from the server at init, and not require a network call as the user types. Furthermore, some completions can be defined statically in the frontend, e.g. emoji completions after typing :.

  • The completer can choose to use any API to communicate with the server. This would allow for completers to use REST and WS APIs provided by Jupyter Server and its extensions. Under this proposed revision, Python APIs can also be called by completers by defining a custom server handler for completer requests. Therefore, this change would strictly increase the scope of server APIs available to a completer.

  • The current implementation of @ mentioning local variables from Jupyter AI requires a JavaScript API; it's not known if Jupyter provides an equivalent Python API. This proposed revision would allow us to get a working demo of this feature to users more quickly.

In summary, this change would strictly decrease the number of network calls made, and strictly increase the scope of server APIs available to each completer. This seems quite reasonable, so I will work on a new revision of this design and open it as a separate issue.

mlucool commented Jan 4, 2025

> Completers can be optimized to minimize network calls.

While that is true, my main point is to design it in a way that puts the UX first, which includes giving the frontend ways to improve rendering performance, in part because the server may be blocked at any point in time by something else happening.

Moving this also lets you have control over the UX of the completer. I think a file completer would want a different experience than a variable one.

As an example, for the @var completer, we envisioned users could click on the variable and interact with it. For example, maybe it lets the user have a preview of what will be sent or maybe it lets the user specify some parameters (e.g. you want the verbose mode of a specific variable). While these are only half-formed ideas, it's good to not restrict.

For @file:, it is a bit more natural to think about wanting a specific UX. For example, a future version may include a file chooser so you can quickly multi-select.

dlqqq commented Jan 4, 2025

@mlucool Thanks for clarifying! Apologies for missing those details. I've added your feedback in the next revision of this design.
