This repo contains an umbrella Helm chart that installs the nim-embed embedding model (for creating embeddings to store in a vector database) along with one of the following LLMs:
- llama3.1-70b-instruct-4bit
- NIM llama3.1-8b-instruct (16-bit quantization)
- Create the target namespace into which all models will be installed:

```sh
oc new-project agent-morpheus-models
```
- Set your NGC_API_KEY (get one here):

```sh
export NGC_API_KEY=your_api_key_goes_here
```
- Replace the placeholder value with your real API key:

```sh
sed -E 's/ \&ngc-api-key changeme/ \&ngc-api-key '$NGC_API_KEY'/' agent-morpheus-models/values.yaml > agent-morpheus-models/yourenv_values.yaml
```
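As a quick sanity check, the same `sed` expression can be tried on a minimal stand-in snippet; the file path and key value below are illustrative, not taken from the chart:

```shell
# Demo of the sed substitution on a hypothetical one-line values snippet
printf 'ngcApiKey: &ngc-api-key changeme\n' > /tmp/demo_values.yaml
NGC_API_KEY=example-key-123
# Same expression as above: replaces the "changeme" placeholder after the YAML anchor
sed -E 's/ \&ngc-api-key changeme/ \&ngc-api-key '"$NGC_API_KEY"'/' /tmp/demo_values.yaml
# → ngcApiKey: &ngc-api-key example-key-123
```

Note that `\&` in the replacement is required because a bare `&` in `sed` stands for the whole matched pattern.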
- Deploying both LLMs together is not possible; attempting to do so produces an error during chart installation:

```sh
helm install --set llama3_1_70b_instruct_4bit.enabled=true --set nim_llm.enabled=true agent-morpheus-models agent-morpheus-models/ -f agent-morpheus-models/yourenv_values.yaml
```

Output:

```
Error: INSTALLATION FAILED: execution error at (agent-morpheus-models/templates/configmap.yaml:6:3): Only one of models should be deployed!, either llama3_1_70b_instruct_4bit or nim_llm 8b, but not both!
```
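A mutual-exclusion guard like this is typically implemented with Helm's `fail` template function; a minimal sketch (the exact template in `agent-morpheus-models/templates/configmap.yaml` may differ) could look like:

```yaml
# Hypothetical sketch of the mutual-exclusion check, not the chart's exact template
{{- if and .Values.llama3_1_70b_instruct_4bit.enabled .Values.nim_llm.enabled }}
{{- fail "Only one of models should be deployed!, either llama3_1_70b_instruct_4bit or nim_llm 8b, but not both!" }}
{{- end }}
```

`fail` aborts rendering immediately, which is why the installation stops with the error shown above before any resources are created.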
- Deploy the chart with one of the two possible LLMs:

```sh
# Deploy with LLM llama3.1-70b-instruct-4bit (the default)
helm install agent-morpheus-models agent-morpheus-models/ -f agent-morpheus-models/yourenv_values.yaml

# Or deploy with LLM meta/llama3.1-8b-instruct (16-bit quantization)
helm install --set llama3_1_70b_instruct_4bit.enabled=false --set nim_llm.enabled=true agent-morpheus-models agent-morpheus-models/ -f agent-morpheus-models/yourenv_values.yaml
```
Output:

```
NAME: agent-morpheus-models
LAST DEPLOYED: Sun Dec 8 23:05:14 2024
NAMESPACE: test-models
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Send a prompt to the model to test it works:
oc wait --for=condition=ready pod -l component=llama3.1-70b-instruct --timeout 1000s
curl -X POST -H "Content-Type: application/json" http://llama3-1-70b-instruct-4bit-agent-morpheus-models.apps.ai-dev03.kni.syseng.devcluster.openshift.com/v1/chat/completions -d @$(git rev-parse --show-toplevel)/agent-morpheus-models/files/70b-4bit-input-example.json | jq .
```
- Wait for the LLM pod to become ready, then send an example request to the LLM:

```sh
oc wait --for=condition=ready pod -l component=llama3.1-70b-instruct --timeout 1000s
curl -X POST -H "Content-Type: application/json" http://llama3-1-70b-instruct-4bit-agent-morpheus-models.apps.ai-dev03.kni.syseng.devcluster.openshift.com/v1/chat/completions -d @$(git rev-parse --show-toplevel)/agent-morpheus-models/files/70b-4bit-input-example.json | jq .
```
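The `jq .` filter above pretty-prints the whole response; if only the model's reply is needed, the assistant message can be pulled out of the OpenAI-compatible response shape. The sample response below is illustrative, not real model output:

```shell
# Extract the assistant reply from an OpenAI-compatible chat completion (sample data)
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"Hello!"}}]}'
echo "$RESPONSE" | jq -r '.choices[0].message.content'
# → Hello!
```

In practice, pipe the `curl` output through `jq -r '.choices[0].message.content'` instead of `jq .`.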
- When you are finished with the models and want to free up resources, uninstall the chart:

```sh
helm uninstall agent-morpheus-models
```