Notebook and data for the NVIDIA-NIM cross-skilling session
To run the notebook, set up the environment in conda using:
conda env create -f env.yaml -p ./xskill-rag
then:
conda activate ./xskill-rag
then start a notebook server and load the notebook in the browser:
jupyter notebook
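Before starting the notebook server, it can help to confirm that the environment is actually active. The following is a minimal sketch, not part of the original setup: `in_conda_env` is a hypothetical helper name, and it relies on `conda activate` setting the `CONDA_PREFIX` variable to the environment directory.

```shell
# Hypothetical helper: succeed only when the active conda environment
# (CONDA_PREFIX, set by `conda activate`) is the directory given as $1.
in_conda_env() {
  [ -n "${CONDA_PREFIX:-}" ] || return 1
  [ -d "$1" ] || return 1
  # Compare fully resolved paths so ./xskill-rag matches its absolute form.
  [ "$(cd "$CONDA_PREFIX" 2>/dev/null && pwd -P)" = "$(cd "$1" && pwd -P)" ]
}

if in_conda_env ./xskill-rag; then
  echo "xskill-rag is active"
else
  echo "xskill-rag is not active; run: conda activate ./xskill-rag"
fi
```

Run this from the directory in which the environment was created, since the path ./xskill-rag is relative.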
- Go to https://org.ngc.nvidia.com/setup/personal-keys and set up an API key (create an NVIDIA account first if you don't already have one).
- When creating an NGC API key, ensure that at least “NGC Catalog” & “Public API” are selected from the “Services Included” dropdown.
- At the command line (on the MPC, 10.167.67.78), set the API key as the required environment variable:
export NGC_API_KEY=<your-api-key>
- Log in to the NVIDIA NGC docker repo:
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
- Install the NGC CLI by following the instructions at https://org.ngc.nvidia.com/setup/installers/cli
- Add the NGC CLI to the path (run this from the directory that contains the ngc-cli folder):
export PATH=$PATH:$(pwd)/ngc-cli
- Configure NGC CLI:
ngc config set
- Check the list of available images:
ngc registry image list --format_type ascii 'nvcr.io/nim/*'
- Get further info for a particular image:
ngc registry image info --format_type ascii nim/meta/llama-3.1-8b-instruct:latest
- Run the contents of setup_MPC on the MPC to run the docker image for the llama-3.1-8b-instruct:latest NIM.
- Check connection with:
curl -X 'POST' \
"http://10.167.67.78:8000/v1/completions" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{"model": "meta/llama3-8b-instruct", "prompt": "Describe Retrieval-Augmented Generation", "max_tokens": 64}'
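If the completion request is rejected, one thing worth checking is whether the `model` value matches a model the server actually reports. The sketch below assumes the NIM exposes the OpenAI-style `GET /v1/models` endpoint alongside `/v1/completions`, and that `python3` is available to parse the JSON reply. `NIM_URL`, `extract_model_ids` and `list_models` are hypothetical names introduced here for illustration.

```shell
# Base URL taken from the curl command above; override NIM_URL for another host.
NIM_URL="${NIM_URL:-http://10.167.67.78:8000}"

# Read a /v1/models JSON reply on stdin and print one model id per line.
# Assumes the OpenAI-style shape: {"object": "list", "data": [{"id": ...}, ...]}
extract_model_ids() {
  python3 -c 'import json, sys; print("\n".join(m["id"] for m in json.load(sys.stdin)["data"]))'
}

# Query the server and list the model ids it serves (10 second timeout).
list_models() {
  curl -sS --max-time 10 "$NIM_URL/v1/models" | extract_model_ids
}
```

Calling `list_models` from a machine that can reach the MPC should print the served model ids; the `model` field in the completion request needs to match one of them.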