* first commit, new llm project
* add indexing functionality
* finish basic pipeline functionality
* update llm_utils and format
* refactored the url scraper + utils
* refactoring part 2
* fix DB update functionality
* add option to switch out the llm within the CLI
* use litellm and drop garbage logs
* formatting
* remove unused title + url
* rip out langchain completely
* error handling and debug statements
* add code inspo acknowledgements
* add and update docstrings
* remove unused code and use zenml urls
* use smaller embedding model
* update the dimensionality to match the new embedding model
* no cache for embeddings generation
* fix constant
* visualise embeddings
* tiny tweaks to params
* add images
* update pipeline code to abstract out DB creds
* add images
* final README updates
* add RAG pipeline image
* formatting
* add super simple RAG pipeline
* even more basic RAG
* add a third irrelevant question
* Refactor preprocess_text and answer_question functions
Showing 22 changed files with 1,186 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
* | ||
!/pipelines/** | ||
!/steps/** | ||
!/materializers/** | ||
!/evaluate/** | ||
!/finetune/** | ||
!/generate/** | ||
!/lit_gpt/** | ||
!/scripts/** |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
Apache Software License 2.0 | ||
|
||
Copyright (c) ZenML GmbH 2024. All rights reserved. | ||
|
||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,144 @@ | ||
# 🦜 Production-ready RAG pipelines for chat applications | ||
|
||
This project showcases how you can work up from a simple RAG pipeline to a more complex setup that | ||
involves finetuning embeddings, reranking retrieved documents, and even finetuning the | ||
LLM itself. We'll do all of this for a use case relevant to ZenML: a question-answering | |
system that can answer common questions about ZenML. This | |
will help you understand how to apply the concepts covered in this guide to your | ||
own projects. | ||
|
||
![](.assets/rag-pipeline-zenml-cloud.png) | ||
|
||
Contained within this project is all the code needed to run the full pipelines. | ||
You can follow along [in our guide](https://docs.zenml.io/user-guide/llmops-guide/) to understand the decisions and tradeoffs | ||
behind the pipeline and step code contained here. You'll build a solid understanding of how to leverage | ||
LLMs in your MLOps workflows using ZenML, enabling you to build powerful, | ||
scalable, and maintainable LLM-powered applications. | ||
|
||
You'll need a PostgreSQL database to store the embeddings; full | |
instructions for setting one up are provided below. | |
|
||
## 🙏🏻 Inspiration and Credit | ||
|
||
The RAG pipeline relies on code from [this Timescale | ||
blog](https://www.timescale.com/blog/postgresql-as-a-vector-database-create-store-and-query-openai-embeddings-with-pgvector/) | ||
that showcased using PostgreSQL as a vector database. We adapted it for our use | |
case and to work with Supabase. | |
|
||
## 🏃 How to run | ||
|
||
This project showcases production-ready pipelines so we use some cloud | ||
infrastructure to manage the assets. You can run the pipelines locally using a | ||
local PostgreSQL database, but we encourage you to use a cloud database for | ||
production use cases. | ||
|
||
### Connecting to ZenML Cloud | ||
|
||
If you run the pipeline using ZenML Cloud you'll have access to the managed | ||
dashboard which will allow you to get started quickly. We offer a free trial so | ||
you can try out the platform without any cost. Visit the [ZenML Cloud | ||
dashboard](https://cloud.zenml.io/) to get started. | ||
|
||
### Setting up Supabase | ||
|
||
[Supabase](https://supabase.com/) is a cloud provider that offers hosted PostgreSQL databases. It's simple to | |
use and has a free tier that should be sufficient for this project. Once you've | ||
created a Supabase account and organisation, you'll need to create a new | ||
project. | ||
|
||
![](.assets/supabase-create-project.png) | ||
|
||
You'll then want to connect to this database instance by getting the connection | ||
string from the Supabase dashboard. | ||
|
||
![](.assets/supabase-connection-string.png) | ||
|
||
You'll then use these details to populate some environment variables where the pipeline code expects them: | ||
|
||
```shell | ||
export ZENML_SUPABASE_USER=<your-supabase-user> | ||
export ZENML_SUPABASE_HOST=<your-supabase-host> | ||
export ZENML_SUPABASE_PORT=<your-supabase-port> | ||
``` | ||
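As a rough illustration of how pipeline code might consume these variables, here is a minimal sketch. The `SupabaseSettings` class and `load_supabase_settings` function are hypothetical names introduced for this example; only the three `ZENML_SUPABASE_*` variable names come from this README.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SupabaseSettings:
    """Connection details read from the environment (hypothetical helper)."""

    user: str
    host: str
    port: int


def load_supabase_settings(env=None) -> SupabaseSettings:
    """Read the ZENML_SUPABASE_* variables, failing loudly if any is missing."""
    env = os.environ if env is None else env
    names = (
        "ZENML_SUPABASE_USER",
        "ZENML_SUPABASE_HOST",
        "ZENML_SUPABASE_PORT",
    )
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing environment variables: {', '.join(missing)}"
        )
    return SupabaseSettings(
        user=env["ZENML_SUPABASE_USER"],
        host=env["ZENML_SUPABASE_HOST"],
        port=int(env["ZENML_SUPABASE_PORT"]),  # ports arrive as strings
    )
```

Failing early with the list of missing names tends to be easier to debug than a connection error raised later inside a pipeline step.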
|
||
You'll want to save the Supabase database password as a ZenML secret so that it | ||
isn't stored in plaintext. You can do this by running the following command: | ||
|
||
```shell | ||
zenml secret create supabase_postgres_db --password="YOUR_PASSWORD" | ||
``` | ||
|
||
### Running the RAG pipeline | ||
|
||
To run the pipeline, you can use the `run.py` script. This script will allow you | ||
to run the pipelines in the correct order. You can run the script with the | ||
following command: | ||
|
||
```shell | ||
python run.py --basic-rag | ||
``` | ||
|
||
This will run the basic RAG pipeline, which scrapes the ZenML documentation and stores the embeddings in the Supabase database. | ||
|
||
### Querying your RAG pipeline assets | ||
|
||
Once the pipeline has run successfully, you can query the assets in the Supabase | |
database by passing your question to the `--rag-query` flag and choosing the LLM | |
with the `--model` flag. | |
|
||
In order to use the default LLM for this query, you'll need an account | ||
and an API key from OpenAI specified as another environment variable: | ||
|
||
```shell | ||
export OPENAI_API_KEY=<your-openai-api-key> | ||
``` | ||
|
||
When you're ready to make the query, run the following command: | ||
|
||
```shell | ||
python run.py --rag-query "how do I use a custom materializer inside my own zenml steps? i.e. how do I set it? inside the @step decorator?" --model=gpt4 | ||
``` | ||
|
||
The available values for `--model` are: | |
|
||
- `gpt4` | ||
- `gpt35` | ||
- `claude3` | ||
- `claudehaiku` | ||
|
||
Note that the Claude models require an API key from Anthropic rather than OpenAI. See [the | |
`litellm` docs](https://docs.litellm.ai/docs/providers/anthropic) on how to set this up. | |
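The flags and short model names used above could be wired together roughly as follows. This is only a sketch using `argparse`: the real `run.py` is not shown here and may be built differently, `build_parser` and `resolve_model` are hypothetical names, and the `gpt35` default is an assumption. The mapping itself is copied from `MODEL_NAME_MAP` in `constants.py`.

```python
import argparse

# Short names and full identifiers, copied from MODEL_NAME_MAP in constants.py.
MODEL_NAME_MAP = {
    "gpt4": "gpt-4-0125-preview",
    "gpt35": "gpt-3.5-turbo",
    "claude3": "claude-3-opus-20240229",
    "claudehaiku": "claude-3-haiku-20240307",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three flags mentioned in this README."""
    parser = argparse.ArgumentParser(description="Run or query the RAG pipeline.")
    parser.add_argument(
        "--basic-rag", action="store_true", help="Run the basic RAG pipeline."
    )
    parser.add_argument(
        "--rag-query", type=str, default=None, help="Question to ask."
    )
    parser.add_argument(
        "--model",
        choices=sorted(MODEL_NAME_MAP),
        default="gpt35",  # assumed default, not confirmed by the source
        help="Short name of the LLM to use.",
    )
    return parser


def resolve_model(short_name: str) -> str:
    """Translate a short CLI name into the full model identifier."""
    return MODEL_NAME_MAP[short_name]


args = build_parser().parse_args(
    ["--rag-query", "how do I set a custom materializer?", "--model=gpt4"]
)
print(resolve_model(args.model))
```

Restricting `--model` to the keys of the mapping means a typo is rejected by the parser before any API call is made.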
|
||
## ☁️ Running with a remote stack | ||
|
||
The basic RAG pipeline will run using a local stack, but if you want to improve | ||
the speed of the embeddings step you might want to consider using a cloud | ||
orchestrator. Please follow the instructions in [our basic cloud setup guides](https://docs.zenml.io/user-guide/cloud-guide) | ||
(currently available for [AWS](https://docs.zenml.io/user-guide/cloud-guide/aws-guide) and [GCP](https://docs.zenml.io/user-guide/cloud-guide/gcp-guide)) to learn how you can run the pipelines on | ||
a remote stack. | ||
|
||
## 📜 Project Structure | ||
|
||
The project loosely follows [the recommended ZenML project structure](https://docs.zenml.io/user-guide/starter-guide/follow-best-practices): | ||
|
||
``` | ||
. | ||
├── LICENSE                       # License file | |
├── README.md                     # This file | |
├── constants.py                  # Constants for the project | |
├── pipelines | |
│   ├── __init__.py | |
│   └── llm_basic_rag.py          # Basic RAG pipeline | |
├── requirements.txt              # Requirements file | |
├── run.py                        # Script to run the pipelines | |
├── steps | |
│   ├── __init__.py | |
│   ├── populate_index.py         # Step to populate the index | |
│   ├── url_scraper.py            # Step to scrape the URLs | |
│   ├── url_scraping_utils.py     # Utilities for the URL scraper | |
│   └── web_url_loader.py         # Step to load the URLs | |
└── utils | |
    ├── __init__.py | |
    └── llm_utils.py              # Utilities related to the LLM | |
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Vector Store constants | ||
CHUNK_SIZE = 500 | ||
CHUNK_OVERLAP = 50 | ||
EMBEDDING_DIMENSIONALITY = ( | |
    384  # Must match the output dimensionality of EMBEDDINGS_MODEL | |
) | |
|
||
# Scraping constants | ||
RATE_LIMIT = 5 # Maximum number of requests per second | ||
|
||
# LLM Utils constants | ||
OPENAI_MODEL = "gpt-3.5-turbo" | ||
EMBEDDINGS_MODEL = "sentence-transformers/all-MiniLM-L12-v2" | ||
MODEL_NAME_MAP = { | ||
"gpt4": "gpt-4-0125-preview", | ||
"gpt35": "gpt-3.5-turbo", | ||
"claude3": "claude-3-opus-20240229", | ||
"claudehaiku": "claude-3-haiku-20240307", | ||
} |
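To make the role of `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, the sketch below splits a string into overlapping windows. It assumes both constants are measured in characters, which the source does not confirm, and `split_text` is a hypothetical helper rather than the project's actual splitter.

```python
CHUNK_SIZE = 500  # value from constants.py; unit assumed to be characters
CHUNK_OVERLAP = 50  # value from constants.py; unit assumed to be characters


def split_text(text, chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


document = "".join(str(i % 10) for i in range(1000))
chunks = split_text(document)
print([len(chunk) for chunk in chunks])
```

With these values each window advances by 450 characters, so consecutive chunks share 50 characters of context across the boundary.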
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
# Apache Software License 2.0 | ||
# | ||
# Copyright (c) ZenML GmbH 2024. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
import os | ||
import re | ||
import string | ||
|
||
from openai import OpenAI | ||
|
||
|
||
def preprocess_text(text): | |
    """Lowercase the text, strip punctuation, and collapse whitespace.""" | |
    text = text.lower() | |
    text = text.translate(str.maketrans("", "", string.punctuation)) | |
    text = re.sub(r"\s+", " ", text).strip() | |
    return text | |
|
||
|
||
def tokenize(text): | |
    """Split preprocessed text into whitespace-separated tokens.""" | |
    return preprocess_text(text).split() | |
|
||
|
||
def retrieve_relevant_chunks(query, corpus, top_n=2): | |
    """Return the top_n chunks ranked by Jaccard similarity of token sets.""" | |
    query_tokens = set(tokenize(query)) | |
    similarities = [] | |
    for chunk in corpus: | |
        chunk_tokens = set(tokenize(chunk)) | |
        union = query_tokens.union(chunk_tokens) | |
        # Guard against division by zero when both token sets are empty. | |
        similarity = ( | |
            len(query_tokens.intersection(chunk_tokens)) / len(union) | |
            if union | |
            else 0.0 | |
        ) | |
        similarities.append((chunk, similarity)) | |
    similarities.sort(key=lambda x: x[1], reverse=True) | |
    return [chunk for chunk, _ in similarities[:top_n]] | |
|
||
|
||
def answer_question(query, corpus, top_n=2): | |
    """Answer the query with an LLM, using the top_n most similar chunks as context.""" | |
    relevant_chunks = retrieve_relevant_chunks(query, corpus, top_n) | |
    if not relevant_chunks: | |
        return "I don't have enough information to answer the question." | |
|
||
    context = "\n".join(relevant_chunks) | |
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) | |
    chat_completion = client.chat.completions.create( | |
        messages=[ | |
            { | |
                "role": "system", | |
                "content": f"Based on the provided context, answer the following question: {query}\n\nContext:\n{context}", | |
            }, | |
            { | |
                "role": "user", | |
                "content": query, | |
            }, | |
        ], | |
        model="gpt-3.5-turbo", | |
    ) | |
|
||
    return chat_completion.choices[0].message.content.strip() | |
|
||
|
||
# Sci-fi themed corpus about "ZenML World" | ||
corpus = [ | ||
"The luminescent forests of ZenML World are inhabited by glowing Zenbots that emit a soft, pulsating light as they roam the enchanted landscape.", | ||
"In the neon skies of ZenML World, Cosmic Butterflies flutter gracefully, their iridescent wings leaving trails of stardust in their wake.", | ||
"Telepathic Treants, ancient sentient trees, communicate through the quantum neural network that spans the entire surface of ZenML World, sharing wisdom and knowledge.", | ||
"Deep within the melodic caverns of ZenML World, Fractal Fungi emit pulsating tones that resonate through the crystalline structures, creating a symphony of otherworldly sounds.", | ||
"Near the ethereal waterfalls of ZenML World, Holographic Hummingbirds hover effortlessly, their translucent wings refracting the prismatic light into mesmerizing patterns.", | ||
"Gravitational Geckos, masters of anti-gravity, traverse the inverted cliffs of ZenML World, defying the laws of physics with their extraordinary abilities.", | ||
"Plasma Phoenixes, majestic creatures of pure energy, soar above the chromatic canyons of ZenML World, their fiery trails painting the sky in a dazzling display of colors.", | ||
"Along the prismatic shores of ZenML World, Crystalline Crabs scuttle and burrow, their transparent exoskeletons refracting the light into a kaleidoscope of hues.", | ||
] | ||
|
||
# Preprocess the corpus | ||
corpus = [preprocess_text(sentence) for sentence in corpus] | ||
|
||
# Ask questions | ||
question1 = "What are Plasma Phoenixes?" | ||
answer1 = answer_question(question1, corpus) | ||
print(f"Question: {question1}") | ||
print(f"Answer: {answer1}") | ||
|
||
question2 = ( | ||
"What kinds of creatures live on the prismatic shores of ZenML World?" | ||
) | ||
answer2 = answer_question(question2, corpus) | ||
print(f"Question: {question2}") | ||
print(f"Answer: {answer2}") | ||
|
||
irrelevant_question_3 = "What is the capital of Panglossia?" | ||
answer3 = answer_question(irrelevant_question_3, corpus) | ||
print(f"Question: {irrelevant_question_3}") | ||
print(f"Answer: {answer3}") |
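The retrieval step in this script ranks chunks by Jaccard similarity, that is, the size of the intersection of two token sets divided by the size of their union. The worked example below reproduces that calculation without calling OpenAI. The `jaccard` helper is a name introduced here for illustration, and the chunk is a shortened version of one corpus sentence.

```python
import re
import string


def tokenize(text):
    """Lowercase, strip punctuation, collapse whitespace, and return a token set."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(re.sub(r"\s+", " ", text).strip().split())


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


query_tokens = tokenize("What are Plasma Phoenixes?")
chunk_tokens = tokenize("Plasma Phoenixes soar above the chromatic canyons.")

# Shared tokens: {"plasma", "phoenixes"}; the union holds 9 distinct tokens.
print(sorted(query_tokens & chunk_tokens))
print(jaccard(query_tokens, chunk_tokens))  # 2 / 9
```

Because common words such as "the" and "of" count as tokens, even an unrelated question usually scores above zero against some chunk, so the top-ranked chunks are not guaranteed to be relevant.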
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Apache Software License 2.0 | ||
# | ||
# Copyright (c) ZenML GmbH 2024. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
from pipelines.llm_basic_rag import llm_basic_rag |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
from steps.populate_index import ( | ||
generate_embeddings, | ||
index_generator, | ||
preprocess_documents, | ||
) | ||
from steps.url_scraper import url_scraper | ||
from steps.web_url_loader import web_url_loader | ||
from zenml import pipeline | ||
|
||
|
||
@pipeline | |
def llm_basic_rag() -> None: | |
    """Executes the basic RAG indexing pipeline. | |
    This function performs the following steps: | |
    1. Scrapes URLs using the url_scraper function. | |
    2. Loads documents from the scraped URLs using the web_url_loader function. | |
    3. Preprocesses the loaded documents using the preprocess_documents function. | |
    4. Generates embeddings for the preprocessed documents using the generate_embeddings function. | |
    5. Generates an index for the embeddings and documents using the index_generator function. | |
    """ | |
    urls = url_scraper() | |
    docs = web_url_loader(urls=urls) | |
    processed_docs = preprocess_documents(documents=docs) | |
    embeddings = generate_embeddings(split_documents=processed_docs) | |
    index_generator(embeddings=embeddings, documents=docs) | |
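Once the index is populated, answering a question comes down to embedding the query and finding the stored chunks whose vectors are closest to it. The sketch below illustrates that idea in plain Python under stated assumptions: it uses cosine similarity and 3-dimensional toy vectors instead of the 384-dimensional embeddings, and `cosine_similarity` and `top_k` are hypothetical helpers. The project's actual query code is not shown here and would run the search inside PostgreSQL.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, index, k=2):
    """Return the texts of the k index entries most similar to the query vector."""
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(query_vector, entry[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy index of (chunk text, embedding) pairs; real embeddings have 384 dimensions.
toy_index = [
    ("chunk about materializers", [1.0, 0.0, 0.0]),
    ("chunk about orchestrators", [0.0, 1.0, 0.0]),
    ("chunk about custom materializers", [0.9, 0.1, 0.0]),
]

print(top_k([1.0, 0.0, 0.0], toy_index))
```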
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
zenml | ||
langchain-community | ||
ratelimit | ||
langchain>=0.0.325 | ||
langchain-openai | ||
pgvector | ||
psycopg2-binary | ||
beautifulsoup4 | ||
unstructured | ||
pandas | ||
numpy | ||
sentence-transformers | ||
litellm |