chore: revise README to include quick start (#123)
Co-authored-by: guofei <[email protected]>
Fei-Guo and Fei-Guo authored Nov 1, 2023
1 parent 9495158 commit 413de7a
Showing 8 changed files with 45 additions and 17 deletions.
39 changes: 36 additions & 3 deletions README.md
@@ -70,8 +70,41 @@ helm uninstall workspace

## Quick start

TODO.
After installing Kaito, one can try the following commands to start a falcon-7b inference service.
```
$ cat examples/kaito_workspace_falcon_7b.yaml
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"
$ kubectl apply -f examples/kaito_workspace_falcon_7b.yaml
```
The workspace status can be tracked by running the following command.
```
$ kubectl get workspace workspace-falcon-7b
NAME                  INSTANCE            RESOURCEREADY   INFERENCEREADY   WORKSPACEREADY   AGE
workspace-falcon-7b   Standard_NC12s_v3   True            True             True             10m
```
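When scripting the step above, the readiness check can be automated instead of read by eye. The following is a minimal sketch, not part of Kaito: `workspace_ready` is a hypothetical helper, and it assumes the column layout shown in the sample output (the three readiness flags in columns 3 to 5 of the first data row).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Kaito): reads the output of
# `kubectl get workspace <name>` on stdin and succeeds only when the
# RESOURCEREADY, INFERENCEREADY and WORKSPACEREADY columns are all "True".
# Assumes the header is line 1 and the workspace row is line 2.
workspace_ready() {
  awk 'NR == 2 { ok = ($3 == "True" && $4 == "True" && $5 == "True") }
       END { exit ok ? 0 : 1 }'
}

# Example polling loop (assumes kubectl is configured for the cluster):
# until kubectl get workspace workspace-falcon-7b | workspace_ready; do
#   sleep 30
# done
```

If a readiness column is still empty, the remaining fields shift left and the check fails, so the loop keeps waiting until all three flags are reported as `True`.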
Once the workspace is ready, one can find the inference service's cluster IP and use a temporary `curl` pod to test the service endpoint from inside the cluster.
```
$ kubectl get svc workspace-falcon-7b
NAME                  TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)            AGE
workspace-falcon-7b   ClusterIP   <CLUSTERIP>   <none>        80/TCP,29500/TCP   10m
$ kubectl run -it --rm --restart=Never curl --image=curlimages/curl sh
~ $ curl -X POST http://<CLUSTERIP>/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"
```
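The hand-escaped JSON body in the `curl` call above breaks as soon as the question itself contains a double quote or a backslash. As a hedged sketch, a small POSIX `sh` helper can build the body instead; `prompt_json` is a hypothetical name, and the helper only assumes the `{"prompt": ...}` shape shown above. It does not handle newlines or other control characters.

```shell
#!/bin/sh
# Hypothetical helper: wraps a question in the {"prompt": "..."} body used by
# the /chat request above, escaping backslashes and double quotes so the
# resulting JSON stays valid. Newlines/control characters are NOT handled.
prompt_json() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"prompt":"%s"}' "$escaped"
}

# Example usage inside the temporary curl pod (paste the function first):
# curl -X POST "http://<CLUSTERIP>/chat" \
#   -H "accept: application/json" -H "Content-Type: application/json" \
#   -d "$(prompt_json 'YOUR QUESTION HERE')"
#
# Outside the pod, the cluster IP can be read without parsing the table:
# kubectl get svc workspace-falcon-7b -o jsonpath='{.spec.clusterIP}'
```

Because the helper relies only on `printf` and `sed`, it should also run in the minimal shell of the `curlimages/curl` image, though that is an assumption about the image rather than something the README states.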

## Contributing

@@ -90,12 +123,12 @@ For more information see the [Code of Conduct FAQ](https://opensource.microsoft.
contact [[email protected]](mailto:[email protected]) with any additional questions or comments.

## Trademarks

<!-- markdown-link-check-disable -->
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.

<!-- markdown-link-check-enable -->
## License

See [LICENSE](LICENSE).
2 changes: 0 additions & 2 deletions examples/kaito_workspace_falcon_40b-instruct.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-falcon-40b-instruct
 resource:
   instanceType: "Standard_NC96ads_A100_v4"
2 changes: 0 additions & 2 deletions examples/kaito_workspace_falcon_40b.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-falcon-40b
 resource:
   instanceType: "Standard_NC96ads_A100_v4"
2 changes: 0 additions & 2 deletions examples/kaito_workspace_falcon_7b-instruct.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-falcon-7b-instruct
 resource:
   instanceType: "Standard_NC12s_v3"
2 changes: 0 additions & 2 deletions examples/kaito_workspace_falcon_7b.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-falcon-7b
 resource:
   instanceType: "Standard_NC12s_v3"
5 changes: 3 additions & 2 deletions examples/kaito_workspace_llama2_13b-chat.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-llama-2-13b-chat
 resource:
   instanceType: "Standard_NC12s_v3"
@@ -12,3 +10,6 @@ resource:
 inference:
   preset:
     name: "llama-2-13b-chat"
+    accessMode: private
+    presetOptions:
+      image: <YOUR IMAGE URL>
5 changes: 3 additions & 2 deletions examples/kaito_workspace_llama2_70b-chat.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-llama-2-70b-chat
 resource:
   instanceType: "Standard_NC96ads_A100_v4"
@@ -13,3 +11,6 @@ resource:
 inference:
   preset:
     name: "llama-2-70b-chat"
+    accessMode: private
+    presetOptions:
+      image: <YOUR IMAGE URL>
5 changes: 3 additions & 2 deletions examples/kaito_workspace_llama2_7b-chat.yaml
@@ -1,8 +1,6 @@
 apiVersion: kaito.sh/v1alpha1
 kind: Workspace
 metadata:
-  annotations:
-    kubernetes-kaito.sh/service-type: load-balancer
   name: workspace-llama-2-7b-chat
 resource:
   instanceType: "Standard_NC12s_v3"
@@ -12,3 +10,6 @@ resource:
 inference:
   preset:
     name: "llama-2-7b-chat"
+    accessMode: private
+    presetOptions:
+      image: <YOUR IMAGE URL>
