Develop a Retrieval-Augmented Generation (RAG) based AI system capable of answering questions about yourself.
My resume is under the data/ folder (you can keep any number of PDF files under data/, either personal or work-related).
The objective is to create a simple RAG agent that answers questions based on that data and an LLM.
Short steps (a skeleton sketch follows this list) -
- load the personal data and split it into chunks (pages)
- for each chunk
  - get embeddings from an LLM model
  - index it in a vector DB
- take the user query via Streamlit
- retrieve context, i.e. the relevant entries from the vector DB, given the query
- use context + query to create a prompt
- pass the prompt to an LLM model
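The steps above map onto a small class. A rough outline of that shape (only `RAGSystem` and `answer_query` are named in this repo; `build_index` and everything else here are illustrative assumptions):

```python
# Illustrative outline only; the actual app.py may be organized differently.
class RAGSystem:
    def __init__(self, data_dir: str = "data/"):
        self.data_dir = data_dir
        self.db = None  # vector store, filled in by build_index()

    def build_index(self) -> None:
        """Load PDFs, split into chunks, embed each chunk, index in the vector DB."""
        ...

    def answer_query(self, query: str) -> str:
        """Retrieve relevant chunks, build a prompt from context + query, call the LLM."""
        ...
```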
- Clone the repository.
- Create a conda environment and activate it.
- Install the required dependencies by running the following command:
  pip install -r requirements.txt
- Install Ollama to run LLM models locally.
- Keep the data under the data/ folder.
- To invoke LLM models, we need to first download the model and then run Ollama locally:
  ollama pull llama3
  ollama serve
- Run the following command:
  streamlit run app.py
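Optionally, before launching the app, you can check that the Ollama server is reachable and that llama3 answers a trivial prompt. A minimal sketch, assuming the langchain_community Ollama integration is installed (it is not necessarily listed in requirements.txt):

```python
# Quick smoke test: confirm the locally served llama3 model responds.
# Assumes `ollama serve` is already running.
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")
print(llm.invoke("Reply with the single word: ready"))
```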
- initialize the RAGSystem class
  - load the data into a usable format
    - load the PDFs as a list of pages
    - pages are further split into chunks for better representation
      - change chunk_size and chunk_overlap according to choice
      - other splitters provided by langchain are also available
    - set a unique id for each chunk (see the loading/splitting sketch after this list)
  - initialize a vector DB where we will store the chunks and their respective embeddings
    - we use chromadb
    - embeddings are from the llama3 model
  - add chunks to the vector DB (see the indexing sketch after this list)
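A minimal sketch of the loading and splitting step, assuming langchain_community's PyPDFDirectoryLoader and a RecursiveCharacterTextSplitter (the exact loader, chunk sizes, and id scheme used in app.py may differ):

```python
# Load every PDF under data/ as pages, split pages into overlapping chunks,
# and give each chunk a deterministic id. Parameter values are illustrative.
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def load_and_split(data_dir: str = "data/"):
    pages = PyPDFDirectoryLoader(data_dir).load()  # one Document per page
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,     # characters per chunk; tune to taste
        chunk_overlap=80,   # overlap keeps context across chunk boundaries
    )
    chunks = splitter.split_documents(pages)

    # Unique id per chunk, e.g. "data/resume.pdf:2:5" = page 2, 5th chunk on that page.
    ids, counts = [], {}
    for chunk in chunks:
        key = f"{chunk.metadata.get('source')}:{chunk.metadata.get('page')}"
        counts[key] = counts.get(key, 0) + 1
        ids.append(f"{key}:{counts[key] - 1}")
    return chunks, ids
```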
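And a corresponding sketch for embedding and indexing those chunks in Chroma with llama3 embeddings served by Ollama (the collection name and persistence directory are assumptions):

```python
# Embed chunks with the Ollama-served llama3 model and store them in a
# persistent Chroma collection. Names and paths here are illustrative.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

def build_vectordb(chunks, ids, persist_dir: str = "chroma_db"):
    embeddings = OllamaEmbeddings(model="llama3")
    db = Chroma(
        collection_name="personal_docs",
        embedding_function=embeddings,
        persist_directory=persist_dir,
    )
    # Explicit ids make it easy to avoid indexing the same chunk twice on re-runs.
    db.add_documents(chunks, ids=ids)
    return db
```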
- user inputs a query
- this calls answer_query, which
  - retrieves context from the vector DB based on the query
    - [improvement] instead of just using whatever entries the vector DB returns, we can use a reranker to reorder the entries w.r.t. our query for better context (see the reranking sketch after this list)
    - langchain provides lots of retrievers
  - creates a prompt using context + query
  - invokes the LLM (llama3) with the prompt
  - formats and sends back the response (see the answer_query sketch after this list)
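A minimal sketch of what answer_query could look like, reusing the Chroma store and Ollama LLM from the earlier sketches (the actual prompt template in app.py may differ):

```python
# Retrieve the top-k chunks, stuff them into a prompt, and ask llama3.
from langchain_community.llms import Ollama

PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
"""

def answer_query(db, query: str, k: int = 5) -> str:
    docs = db.similarity_search(query, k=k)  # relevant chunks from the vector DB
    context = "\n\n---\n\n".join(d.page_content for d in docs)
    prompt = PROMPT_TEMPLATE.format(context=context, question=query)
    response = Ollama(model="llama3").invoke(prompt)
    return response.strip()
```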
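One way to implement the reranking improvement is a cross-encoder that scores each retrieved chunk against the query. A sketch assuming the sentence-transformers package, an over-fetch of candidates, and an illustrative model name:

```python
# Over-fetch candidates from the vector DB, then keep the best-scoring ones
# according to a query/passage cross-encoder.
from sentence_transformers import CrossEncoder

def retrieve_reranked(db, query: str, fetch_k: int = 20, top_k: int = 5):
    candidates = db.similarity_search(query, k=fetch_k)
    scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = scorer.predict([(query, doc.page_content) for doc in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]
```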
- evaluation: we can have an LLM judge our response, given an evaluation prompt such as -
  eval_prompt = """
  Expected Response: {expected_response}
  Actual Response: {actual_response}
  ---
  On a scale of 1 to 5, with 5 being identical, how well does the actual response match the expected response?
  """