Merge branch 'main' into feature/demo-gitflow-complete-llm
AlexejPenner committed Nov 4, 2024
2 parents cfc5567 + b20d5af commit e87f0f4
Showing 9 changed files with 653 additions and 28 deletions.
52 changes: 48 additions & 4 deletions llm-complete-guide/README.md
@@ -57,9 +57,9 @@ export ZENML_PROJECT_SECRET_NAME=llm-complete

### Setting up Supabase

- [Supabase](https://supabase.com/) is a cloud provider that provides a PostgreSQL
+ [Supabase](https://supabase.com/) is a cloud provider that offers a PostgreSQL
database. It's simple to use and has a free tier that should be sufficient for
- this project. Once you've created a Supabase account and organisation, you'll
+ this project. Once you've created a Supabase account and organization, you'll
need to create a new project.

![](.assets/supabase-create-project.png)
@@ -76,7 +76,7 @@ string from the Supabase dashboard.

![](.assets/supabase-connection-string.png)

- In case supabase is not an option for you, you can use a different database as the backend.
+ In case Supabase is not an option for you, you can use a different database as the backend.

### Running the RAG pipeline

@@ -114,6 +114,51 @@ Note that Claude will require a different API key from Anthropic. See [the
`litellm` docs](https://docs.litellm.ai/docs/providers/anthropic) on how to set
this up.

### Deploying the RAG pipeline

![](.assets/huggingface-space-rag-deployment.png)

You'll need to update and add some secrets to make this work with your Hugging
Face account. To get your ZenML service account API token and store URL, you can
first create a new service account:

```bash
zenml service-account create <SERVICE_ACCOUNT_NAME>
```

For more information on this part of the process, please refer to the [ZenML
documentation](https://docs.zenml.io/how-to/project-setup-and-management/connecting-to-zenml/connect-with-a-service-account).

Once you have your service account API token and store URL (the URL of your
deployed ZenML tenant), you can update the secrets with the following command:

```bash
zenml secret update llm-complete --zenml_api_token=<YOUR_ZENML_SERVICE_ACCOUNT_API_TOKEN> --zenml_store_url=<YOUR_ZENML_STORE_URL>
```

To set the Hugging Face user and space used for the Gradio app deployment,
set the following environment variables:

```bash
export ZENML_HF_USERNAME=<YOUR_HF_USERNAME>
export ZENML_HF_SPACE_NAME=<YOUR_HF_SPACE_NAME> # optional, defaults to "llm-complete-guide-rag"
```
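As a minimal sketch of the defaulting behavior described above (this is illustrative, not the repo's actual code — the `resolve_space` helper is hypothetical), the deployment can be thought of as resolving the target space from these two variables, with `ZENML_HF_SPACE_NAME` falling back to `"llm-complete-guide-rag"`:

```python
# Illustrative sketch (not the repo's actual code): resolve the target
# Hugging Face space from the environment variables documented above,
# assuming the default space name "llm-complete-guide-rag".
def resolve_space(env):
    username = env["ZENML_HF_USERNAME"]  # required
    space = env.get("ZENML_HF_SPACE_NAME", "llm-complete-guide-rag")
    return f"{username}/{space}"

# With only the username set, the documented default space name is used.
print(resolve_space({"ZENML_HF_USERNAME": "alice"}))  # alice/llm-complete-guide-rag
```
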

To deploy the RAG pipeline, you can use the following command:

```shell
python run.py --deploy
```

Alternatively, you can run the basic RAG pipeline *and* deploy it in one go:

```shell
python run.py --rag --deploy
```

This will open a Hugging Face space in your browser where you can interact with
the RAG pipeline.

### Run the LLM RAG evaluation pipeline

To run the evaluation pipeline, you can use the following command:
@@ -157,7 +202,6 @@ will need to change the hf repo urls to a space you have permissions to.
zenml secret update llm-complete -v '{"argilla_api_key": "YOUR_ARGILLA_API_KEY", "argilla_api_url": "YOUR_ARGILLA_API_URL", "hf_token": "YOUR_HF_TOKEN"}'
```
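Since the `-v` flag takes a JSON object, a quoting mistake in the shell can silently mangle it. A quick pre-flight check like the following (a hypothetical convenience, not part of the repo) confirms the payload parses and contains all three expected keys before you run the command:

```python
import json

# Hypothetical pre-flight check (not part of the repo): validate the JSON
# payload for `zenml secret update -v ...` before invoking the CLI.
payload = (
    '{"argilla_api_key": "YOUR_ARGILLA_API_KEY", '
    '"argilla_api_url": "YOUR_ARGILLA_API_URL", '
    '"hf_token": "YOUR_HF_TOKEN"}'
)

values = json.loads(payload)  # raises ValueError on malformed JSON
missing = {"argilla_api_key", "argilla_api_url", "hf_token"} - values.keys()
assert not missing, f"missing keys: {missing}"
print(sorted(values))
```
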


### Finetune the embeddings

As with the previous pipeline, you will need to have set up and connected to an Argilla instance for this
13 changes: 13 additions & 0 deletions llm-complete-guide/deployment_hf.py
@@ -0,0 +1,13 @@
import gradio as gr
from utils.llm_utils import process_input_with_retrieval


def predict(message, history):
    return process_input_with_retrieval(
        input=message,
        n_items_retrieved=20,
        use_reranking=True,
    )


gr.ChatInterface(predict, type="messages").launch()
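The `predict` callback above follows the contract `gr.ChatInterface` expects: a callable taking `(message, history)` and returning the reply string. A runnable stand-in (where `fake_retrieval` is a hypothetical substitute for `process_input_with_retrieval`, so no database is needed) looks like this:

```python
# Stand-in sketch of the predict contract gr.ChatInterface expects:
# a callable taking (message, history) and returning the reply string.
# fake_retrieval is a hypothetical substitute for
# process_input_with_retrieval, so this runs without a vector database.
def fake_retrieval(input, n_items_retrieved=20, use_reranking=True):
    return f"[{n_items_retrieved} docs, reranked={use_reranking}] {input}"

def predict(message, history):
    return fake_retrieval(input=message)

print(predict("What is ZenML?", []))  # [20 docs, reranked=True] What is ZenML?
```
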
2 changes: 1 addition & 1 deletion llm-complete-guide/pipelines/llm_basic_rag.py
@@ -38,6 +38,6 @@ def llm_basic_rag() -> None:
    """
    urls = url_scraper()
    docs = web_url_loader(urls=urls)
-    processed_docs = preprocess_documents(documents=docs)
+    processed_docs, _, _ = preprocess_documents(documents=docs)
    embedded_docs = generate_embeddings(split_documents=processed_docs)
    index_generator(documents=embedded_docs)
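The changed line unpacks three return values from `preprocess_documents` where the old code took one. A minimal sketch of the assumed new shape (the body and the names of the two extra outputs are hypothetical, not the repo's code):

```python
# Minimal sketch (assumed shape, not the repo's code): a step returning
# three outputs, which is why the caller now unpacks `processed_docs, _, _`.
def preprocess_documents(documents):
    processed = [d.strip().lower() for d in documents]
    # Hypothetical auxiliary outputs, e.g. a count and a list of skipped docs.
    num_docs = len(processed)
    skipped = []
    return processed, num_docs, skipped

processed_docs, _, _ = preprocess_documents(["  Hello ", "World"])
print(processed_docs)  # ['hello', 'world']
```
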
4 changes: 1 addition & 3 deletions llm-complete-guide/requirements.txt
@@ -1,13 +1,11 @@
zenml[server]>=0.68.1
langchain-community
ratelimit
langchain>=0.0.325
langchain-openai
pgvector
psycopg2-binary
beautifulsoup4
unstructured
pandas
openai
numpy
sentence-transformers>=3
transformers
2 changes: 1 addition & 1 deletion llm-complete-guide/run.py
Expand Up @@ -117,7 +117,7 @@
    "--config",
    "config",
    default=None,
-    help="Generate chunks for Hugging Face dataset",
+    help="Path to config",
)
def main(
    pipeline: str,
