docs: Add instructions about how to add new model in kaito #224

Merged
merged 3 commits into from
Feb 1, 2024
5 changes: 2 additions & 3 deletions README.md
@@ -224,6 +224,8 @@ $ kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- curl -X P

The detailed usage for Kaito supported models can be found [**HERE**](presets/README.md). If users want to deploy their own containerized models, they can provide the pod template in the `inference` field of the workspace custom resource (please see [API definitions](api/v1alpha1/workspace_types.go) for details). The controller will create a deployment workload using all provisioned GPU nodes. Note that currently the controller does **NOT** handle automatic model upgrades. It only creates inference workloads based on the preset configurations if the workloads do not exist.

The number of supported models in Kaito is growing! Please check [this](./docs/How-to-add-new-models.md) document to see how to add a new supported model.

## Contributing

[Read more](docs/contributing/readme.md)
@@ -258,6 +260,3 @@ This project has adopted the [Microsoft Open Source Code of Conduct](https://ope
## Contact

"Kaito devs" <[email protected]>



28 changes: 28 additions & 0 deletions docs/How-to-add-new-models.md
@@ -0,0 +1,28 @@
# New OSS model onboarding

This document describes how to add a new supported OSS model to Kaito. The process is designed so that community users can initiate the request. Kaito maintainers then follow up by managing the model images and guiding the code changes needed to set up the model preset configurations.

## Step 1: Make a proposal

This step is done by the requestor. The requestor should open a PR that describes the target OSS model, following this [template](./proposals/YYYYMMDD-model-template.md). The proposal status should initially be `provisional`. Kaito maintainers will review the PR and decide whether to accept or reject it. The PR may be rejected if the target OSS model has low usage, has strict license limitations, or is a relatively small model with limited capabilities.
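
The proposal's `status` field moves through the values named in this document (`provisional`, `ready to integrate`, `integrated`). The Go sketch below models that progression as a small lookup table. It is illustrative only: the type and function names are hypothetical, and the strictly linear ordering is an assumption, not something Kaito enforces in code.

```go
package main

import "fmt"

// Status mirrors the `status` field in a proposal's front matter.
type Status string

const (
	Provisional      Status = "provisional"
	ReadyToIntegrate Status = "ready to integrate"
	Integrated       Status = "integrated"
)

// next maps each status to the single status assumed to follow it.
// Assumption: proposals advance one stage at a time and never move back.
var next = map[Status]Status{
	Provisional:      ReadyToIntegrate,
	ReadyToIntegrate: Integrated,
}

// CanTransition reports whether a proposal may move from one status to another.
func CanTransition(from, to Status) bool {
	n, ok := next[from]
	return ok && n == to
}

func main() {
	fmt.Println(CanTransition(Provisional, ReadyToIntegrate)) // true
	fmt.Println(CanTransition(Provisional, Integrated))       // false
}
```

A rejected proposal has no status of its own in this sketch; the PR is simply not merged.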


## Step 2: Validate and test the model

This step is done by Kaito maintainers. Based on the information provided in the proposal, Kaito maintainers will download the model and test it using the specified runtime. The entire process is automated via GitHub Actions and is triggered when Kaito maintainers file a PR to add the model to [supported\_models.yaml](../presets/models/supported_models.yaml).


## Step 3: Push model image to MCR

This step is done by Kaito maintainers. If the model license allows, Kaito maintainers will push the model image to MCR, making the image publicly available. This step is skipped if only private access is allowed for the model image. Once this step is done, Kaito maintainers will update the status of the proposal submitted in Step 1 to `ready to integrate`.

## Step 4: Add preset configurations

This step is done by the requestor. The requestor will work on a PR to register the model with preset configurations. The PR will contain code changes to implement a simple inference interface. [Here](../presets/models/falcon/model.go) is an existing example. In the same PR, or a separate PR, the proposal status should be updated to `integrated`.
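
To give a rough idea of what "registering a model with preset configurations" can look like, here is a minimal, self-contained Go sketch of a name-to-model registry behind a small interface. Every name in it (`Model`, `InferenceParams`, `Registry`, `exampleModel`, the image reference) is hypothetical and chosen for illustration. It is not Kaito's actual interface; the linked `model.go` is the authoritative example.

```go
package main

import (
	"fmt"
	"sync"
)

// InferenceParams is a hypothetical bundle of preset settings.
type InferenceParams struct {
	Image    string // container image that holds the model
	GPUCount int    // number of GPUs the preset expects
}

// Model is a hypothetical inference interface that each preset implements.
type Model interface {
	InferenceParams() InferenceParams
}

// Registry maps model names to their implementations.
type Registry struct {
	mu     sync.RWMutex
	models map[string]Model
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{models: make(map[string]Model)}
}

// Register adds a model under name and rejects duplicate names.
func (r *Registry) Register(name string, m Model) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.models[name]; exists {
		return fmt.Errorf("model %q is already registered", name)
	}
	r.models[name] = m
	return nil
}

// Get looks up a registered model by name.
func (r *Registry) Get(name string) (Model, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	m, ok := r.models[name]
	return m, ok
}

// exampleModel is a placeholder preset with made-up values.
type exampleModel struct{}

func (exampleModel) InferenceParams() InferenceParams {
	return InferenceParams{Image: "example.invalid/models/example:0.0.1", GPUCount: 1}
}

func main() {
	reg := NewRegistry()
	if err := reg.Register("example-model", exampleModel{}); err != nil {
		panic(err)
	}
	m, _ := reg.Get("example-model")
	fmt.Println(m.InferenceParams().GPUCount) // 1
}
```

Rejecting duplicate names at registration time is a design choice of this sketch; it surfaces naming collisions early rather than silently replacing an existing preset.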

## Step 5: Add an E2E test

This step is done by the requestor. A new E2E test should be added [here](../test/e2e/preset_test.go) to ensure that the inference service is up and running with the preset configurations.


After all the above steps are done, the new model becomes available in Kaito.
2 changes: 1 addition & 1 deletion docs/proposals/YYYYMMDD-model-template.md
@@ -6,7 +6,7 @@ reviewers:
- "Kaito contributor"
creation-date: yyyy-mm-dd
last-updated: yyyy-mm-dd
-status: provisional|implemented|deferred|rejected|withdrawn
+status: provisional|ready to integrate|integrated
---

# Title