From e0d5be41fe4eafc830409c8d3460de0fc793d724 Mon Sep 17 00:00:00 2001 From: Matthew Farrellee Date: Tue, 10 Dec 2024 16:23:56 -0500 Subject: [PATCH] add nvidia nim inference provider to docs (#534) # What does this PR do? add [NVIDIA NIM](https://build.nvidia.com/nim?filters=nimType%3Anim_type_run_anywhere&q=llama) reference to the docs ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Ran pre-commit to handle lint / formatting issues. - [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [x] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests. --- README.md | 1 + docs/source/concepts/index.md | 2 +- docs/source/index.md | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index f60069e45a..147e2d3797 100644 --- a/README.md +++ b/README.md @@ -86,6 +86,7 @@ Additionally, we have designed every element of the Stack such that APIs as well | Together | Hosted | :heavy_check_mark: | :heavy_check_mark: | | :heavy_check_mark: | | | Ollama | Single Node | | :heavy_check_mark: | | | | TGI | Hosted and Single Node | | :heavy_check_mark: | | | +| [NVIDIA NIM](https://build.nvidia.com/nim?filters=nimType%3Anim_type_run_anywhere&q=llama) | Hosted and Single Node | | :heavy_check_mark: | | | | Chroma | Single Node | | | :heavy_check_mark: | | | | PG Vector | Single Node | | | :heavy_check_mark: | | | | PyTorch ExecuTorch | On-device iOS | :heavy_check_mark: | :heavy_check_mark: | | | diff --git a/docs/source/concepts/index.md b/docs/source/concepts/index.md index eccd90b7c0..d7c88cbf94 100644 --- a/docs/source/concepts/index.md +++ b/docs/source/concepts/index.md @@ -58,7 +58,7 @@ While there is a lot of flexibility to mix-and-match providers, often users will **Remotely Hosted Distro**: These are the simplest to consume from a user perspective. You can simply obtain the API key for these providers, point to a URL and have _all_ Llama Stack APIs working out of the box. Currently, [Fireworks](https://fireworks.ai/) and [Together](https://together.xyz/) provide such easy-to-consume Llama Stack distributions. -**Locally Hosted Distro**: You may want to run Llama Stack on your own hardware. Typically though, you still need to use Inference via an external service. You can use providers like HuggingFace TGI, Cerebras, Fireworks, Together, etc. for this purpose. Or you may have access to GPUs and can run a [vLLM](https://github.com/vllm-project/vllm) instance. If you "just" have a regular desktop machine, you can use [Ollama](https://ollama.com/) for inference. To provide convenient quick access to these options, we provide a number of such pre-configured locally-hosted Distros. +**Locally Hosted Distro**: You may want to run Llama Stack on your own hardware. Typically though, you still need to use Inference via an external service. You can use providers like HuggingFace TGI, Cerebras, Fireworks, Together, etc. for this purpose. Or you may have access to GPUs and can run a [vLLM](https://github.com/vllm-project/vllm) or [NVIDIA NIM](https://build.nvidia.com/nim?filters=nimType%3Anim_type_run_anywhere&q=llama) instance. If you "just" have a regular desktop machine, you can use [Ollama](https://ollama.com/) for inference. To provide convenient quick access to these options, we provide a number of such pre-configured locally-hosted Distros. **On-device Distro**: Finally, you may want to run Llama Stack directly on an edge device (mobile phone or a tablet.) We provide Distros for iOS and Android (coming soon.) diff --git a/docs/source/index.md b/docs/source/index.md index ee7f00e0aa..5d7499a047 100644 --- a/docs/source/index.md +++ b/docs/source/index.md @@ -44,6 +44,7 @@ A number of "adapters" are available for some popular Inference and Memory (Vect | Together | Hosted | Y | Y | | Y | | | Ollama | Single Node | | Y | | | | TGI | Hosted and Single Node | | Y | | | +| [NVIDIA NIM](https://build.nvidia.com/nim?filters=nimType%3Anim_type_run_anywhere&q=llama) | Hosted and Single Node | | Y | | | | Chroma | Single Node | | | Y | | | | Postgres | Single Node | | | Y | | | | PyTorch ExecuTorch | On-device iOS | Y | Y | | |