minor edits to the post
swgriffith committed Apr 16, 2024
1 parent 286acc2 commit 0cd7356
Showing 1 changed file with 4 additions and 5 deletions: docs/_posts/2024-04-16-aks-kaito.md

The Kubernetes AI Toolchain Operator, also known as Project KAITO, is an open-source solution that simplifies the deployment of inference models in a Kubernetes cluster. In particular, the focus is on simplifying the operation of the most popular models available (e.g. Falcon, Mistral, and Llama 2).

KAITO provides operators that validate the requested model against the requested nodepool hardware, deploy the nodepool, and deploy the model itself along with a REST endpoint to reach the model.

In this walkthrough we'll deploy an AKS cluster with the KAITO managed add-on. Next we'll deploy and test an inference model, which we'll pull from our own private container registry. We'll be following the setup guide from the AKS product docs [here](https://learn.microsoft.com/en-us/azure/aks/ai-toolchain-operator) with some of my own customizations and extensions to simplify tasks.

## Cluster Creation

```bash
RG=KaitoLab
LOC=westus3
ACR_NAME=kaitolab
CLUSTER_NAME=kaitocluster

# Create the resource group
az group create -n $RG -l $LOC
```
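The remaining setup commands are collapsed in this diff. A minimal sketch of the registry and cluster creation, assuming the `--enable-ai-toolchain-operator` flag from the AKS product docs (which also requires the OIDC issuer to be enabled), might look like this:

```shell
# Hypothetical sketch based on the AKS KAITO add-on docs;
# variable names come from the environment setup above.

# Create an Azure Container Registry for the model images
az acr create -g $RG -n $ACR_NAME --sku Premium

# Create the AKS cluster with the KAITO managed add-on enabled
az aks create -g $RG -n $CLUSTER_NAME \
  --enable-oidc-issuer \
  --enable-ai-toolchain-operator \
  --generate-ssh-keys
```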
Attach the container registry to the cluster so it can pull the model images:

```bash
az aks update -g $RG -n $CLUSTER_NAME --attach-acr $ACR_NAME
```

## Deploy a model!

Now that our cluster and registry are all set, we're ready to deploy our first model. We'll generate our 'Workspace' manifest ourselves, but you can also pull from the [examples](https://github.com/Azure/kaito/blob/main/presets/README.md) in the KAITO repo and update as needed. The model below is actually taken directly from the examples; however, I added the 'presetOptions' section to set the source of the model image.

>**NOTE:** Make sure you validate that you have quota in the target subscription for the machine type you select below.
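The workspace manifest itself is collapsed in this diff. A sketch of what it might look like, based on the falcon-7b preset in the KAITO examples; the `presetOptions` image path below is illustrative, not the post's actual value:

```shell
# Hypothetical sketch based on the KAITO falcon-7b preset example.
# Point the presetOptions image at your own private registry.
cat <<EOF | kubectl apply -f -
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"
    presetOptions:
      image: "kaitolab.azurecr.io/kaito/falcon-7b:0.0.1"
EOF
```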
```bash
watch kubectl get workspace,nodes,svc,pods
```
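KAITO surfaces progress through conditions on the Workspace resource. Assuming a workspace named `workspace-falcon-7b`, you can inspect them with something like:

```shell
# List the Workspace readiness conditions (workspace name is assumed)
kubectl get workspace workspace-falcon-7b \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
```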

Now that our model is running, we can send it a request. By default, the model is only accessible via a ClusterIP inside the Kubernetes cluster, so you'll need to access the endpoint from a test pod. We'll use a public 'curl' image, but you can use whatever you prefer.

You do have the option to expose the model via a Kubernetes Service of type 'LoadBalancer' via the workspace configuration, but that generally isn't recommended. Typically you'd call the model from another service inside the cluster, or place the endpoint behind an ingress controller.

```bash
# Get the model cluster IP
```
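The commands collapsed above presumably retrieve the service's ClusterIP and call the endpoint from a throwaway pod. A hedged sketch, assuming the workspace is named `workspace-falcon-7b` and the preset exposes a `/chat` route as in the KAITO examples:

```shell
# Get the model's ClusterIP (service name assumed to match the workspace name)
CLUSTERIP=$(kubectl get svc workspace-falcon-7b -o jsonpath='{.spec.clusterIP}')

# Call the inference endpoint from a temporary curl pod
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -X POST "http://$CLUSTERIP/chat" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Kubernetes?"}'
```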
