-
To address the issues you're encountering with your RAG chatbot, let's tackle them one by one:
By addressing these specific areas, you should be able to improve the performance and accuracy of your chatbot.
-
@dosu can you write the whole code from the beginning, i.e. from storing the index onward, without using OpenAI? And can you also code it with Streamlit?
-
@dosu can you give me a simple example of a context chat engine using the HuggingFace Inference API without it throwing the error that messages must be of odd length?
-
@logan-markewich @Thomas-AH-Heller please help, I keep getting the same "messages must be of odd length" error, even though the same code works for the condense question chat engine. I want to use the context chat engine because I want the chatbot to answer only from the context. I have read all the documents and even that isn't helping. Even an updated document showing an example with a local LLM or the HuggingFace Inference API would work wonders for me. Please, please help!
-
The traceback points to llama_index\llms\huggingface\base.py, line 450, in chat_messages_to_conversational_kwargs. It seems to be an issue with the validation in huggingface\base.py. I am using HuggingFaceInferenceAPI as the LLM.
-
I have had a similar issue, and I think it might be the use of HuggingFaceInferenceAPI. My theory is that the API expects a single query to get a response from the LLM, but the chat engine sends multiple messages at once, which is causing the error. Again, it's my own speculation; I don't know for sure.
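A quick way to sanity-check that theory is to mirror the length check the traceback points to. This is only a sketch: `check_odd_length` below is a hypothetical stand-in for whatever validation `chat_messages_to_conversational_kwargs` really does, assuming it simply rejects even-length message lists, as the error text suggests:

```python
from llama_index.core.base.llms.types import ChatMessage, MessageRole

# Hypothetical stand-in for the validation in
# llama_index/llms/huggingface/base.py (chat_messages_to_conversational_kwargs).
def check_odd_length(messages):
    if len(messages) % 2 == 0:
        raise NotImplementedError("Messages passed in must be of odd length.")

history = [
    ChatMessage(role=MessageRole.USER, content="Hello assistant ..."),
    ChatMessage(role=MessageRole.ASSISTANT, content="Okay, sounds good."),
]
new_turn = [ChatMessage(role=MessageRole.USER, content="Tell me about Mumbai.")]

check_odd_length(history + new_turn)  # 3 messages: alternating, ends on USER -> passes

# A context chat engine prepends a SYSTEM message carrying the retrieved context,
# which bumps the count to 4 and trips the check:
system = [ChatMessage(role=MessageRole.SYSTEM, content="Context information is below ...")]
check_odd_length(system + history + new_turn)  # 4 messages -> raises NotImplementedError
```

If that is what is happening, it would also explain why the condense question engine works: it collapses everything into a single standalone query before calling the LLM.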
-
Question
I am trying to create a RAG chatbot that only answers from the content I provide through PDFs and never uses its own knowledge to answer or to add information that is outside of the context. At the moment I am facing three issues:
1. the answer length is quite small
2. the follow-up question that it creates is totally off the mark
3. model overload issues

As I am creating a RAG chatbot for e-book summarization, the answers are meant to be lengthy. I have used HuggingFace for the embeddings as well as for the LLM. There were very few docs to refer to when using CondenseQuestionChatEngine, so I had to learn as I went. I thought I had solved it; then I realized it was running on the default LLM, which is OpenAI, and after solving that issue all the prompts I created for the condense question engine started to fail. If I run the question through the query engine and print the result, I get the right answers, but when the chat engine tries to make sense of the chat history it goes completely off the grid and asks the wrong question altogether, and the way it is set up the bot then replies with "I don't know".
I then tried to switch to the context chat engine just to compare the results, and it gives me this error:
`NotImplementedError("Messages passed in must be of odd length.")`
Here is my code for reference:
```python
import streamlit as st
import logging
import sys
import os.path

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

# (Duplicate and unused imports from the original have been dropped.)
from llama_index.core import Settings, PromptTemplate
from llama_index.core.service_context import set_global_service_context
from llama_index.core.chat_engine import CondenseQuestionChatEngine, ContextChatEngine
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.base.llms.types import ChatMessage, MessageRole
from llama_index.llms.huggingface import HuggingFaceInferenceAPI
from langchain.embeddings.huggingface import HuggingFaceBgeEmbeddings

# Older installs expose these at the top level, newer ones under .core
try:
    from llama_index import (
        VectorStoreIndex,
        ServiceContext,
        SimpleDirectoryReader,
        StorageContext,
        load_index_from_storage,
    )
except ImportError:
    from llama_index.core import (
        VectorStoreIndex,
        ServiceContext,
        SimpleDirectoryReader,
        StorageContext,
        load_index_from_storage,
    )

from huggingface_hub import login

login("MY_HUGGINGFACE_API_KEY")

st.set_page_config(page_title="Chat with the Streamlit docs, powered by LlamaIndex", page_icon="🦙", layout="centered", initial_sidebar_state="auto", menu_items=None)
st.title("Chat with the Streamlit docs, powered by LlamaIndex 💬🦙")
st.info("Check out the full tutorial to build this app in our blog post", icon="📃")
prompt_template = """### System: Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Only return the helpful answer below and nothing else.
Helpful answer:
"""
if "messages" not in st.session_state.keys(): # Initialize the chat messages history
st.session_state.messages = [
{"role": "assistant", "content": "Ask me a question about the E-books!"}
]
PERSIST_DIR = "./storage"
@st.cache_resource(show_spinner=False)
def load_data():
    with st.spinner(text="Loading and indexing the E-books – hang tight! This should take 1-2 minutes."):
        llm = HuggingFaceInferenceAPI(
            generate_kwargs={"temperature": 0.0},
            model_name="meta-llama/Llama-2-70b-chat-hf",
        )
        embed_model = HuggingFaceBgeEmbeddings(
            model_name="BAAI/bge-large-en",
            model_kwargs={"device": "cpu"},
            encode_kwargs={"normalize_embeddings": False},
        )
        service_context = ServiceContext.from_defaults(
            chunk_size=1000,
            chunk_overlap=100,
            embed_model=embed_model,
            llm=llm,
        )
        set_global_service_context(service_context)
        Settings.llm = llm
        Settings.embed_model = embed_model
        if not os.path.exists(PERSIST_DIR):
            # Build the index from the PDFs and persist it
            # (the original loaded the documents twice; once is enough)
            reader = SimpleDirectoryReader(input_dir="./data", recursive=True)
            docs = reader.load_data()
            index = VectorStoreIndex.from_documents(documents=docs, service_context=service_context)
            index.storage_context.persist(persist_dir=PERSIST_DIR)
            st.write("LoadEmbedding>>>", index)
        else:
            # Load the existing index
            storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
            index = load_index_from_storage(storage_context)
            st.write("StoredEmbedding>>>", index)
        return index
index = load_data()
# Prompt templates. (The original wrapped these in an unused generate_text()
# function; they are defined at module level here.)
qa_prompt_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given only the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)
refine_prompt_str = (
    "We have the opportunity to refine the original answer "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better "
    "answer the question: {query_str}. "
    "If the context isn't useful, output the original answer again.\n"
    "Original Answer: {existing_answer}"
)
chat_text_qa_msgs = [
    ChatMessage(role=MessageRole.SYSTEM, content=prompt_template),
    ChatMessage(role=MessageRole.USER, content=qa_prompt_str),
]
text_qa_template = ChatPromptTemplate(chat_text_qa_msgs)
# Refine prompt
chat_refine_msgs = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=(
            "If the context isn't helpful, just say I don't know. Don't add any "
            "information to the answer that is not available in the context."
        ),
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=(
            "New Context: {context_msg}\n"
            "Query: {query_str}\n"
            "Original Answer: {existing_answer}\n"
            "New Answer: "
        ),
    ),
]
refine_template = ChatPromptTemplate(chat_refine_msgs)
custom_prompt = PromptTemplate(
    """\
Given a conversation (between Human and Assistant) and a follow up message from Human,
rewrite the message to be a standalone question that captures all relevant context
from the conversation.

{chat_history}
{question}
"""
)
# List of ChatMessage objects used to seed the chat history
custom_chat_history = [
    ChatMessage(
        role=MessageRole.USER,
        content=(
            "Hello assistant, we are having an insightful discussion about the given "
            "content and you are helping me understand the content by answering, "
            "summarizing and explaining it without changing its true meaning."
        ),
    ),
    ChatMessage(role=MessageRole.ASSISTANT, content="Okay, sounds good."),
]
if "chat_engine" not in st.session_state.keys():  # Initialize the chat engine
    # NOTE: the snippet as posted never actually created the chat engine, so the
    # .chat() call below would fail. This reconstruction wires the templates above
    # into a CondenseQuestionChatEngine (the context variant trips the odd-length
    # check discussed in this thread).
    query_engine = index.as_query_engine(
        text_qa_template=text_qa_template,
        refine_template=refine_template,
    )
    st.session_state.chat_engine = CondenseQuestionChatEngine.from_defaults(
        query_engine=query_engine,
        condense_question_prompt=custom_prompt,
        chat_history=custom_chat_history,
    )
if prompt := st.chat_input("Your question"):  # Prompt for user input and save to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})
for message in st.session_state.messages:  # Display the prior chat messages
    with st.chat_message(message["role"]):
        st.write(message["content"])
# If last message is not from assistant, generate a new response
if st.session_state.messages[-1]["role"] != "assistant":
    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            # The original called both chat() and stream_chat() back to back;
            # a single call is enough.
            response = st.session_state.chat_engine.chat(prompt)
            st.write(response.response)
            message = {"role": "assistant", "content": response.response}
            st.session_state.messages.append(message)  # Add response to message history
```
I am now sharing some examples. Here I asked about Mumbai and about a story; knowledge about both was given by me in the form of a PDF.
(screenshot of the chat omitted)
As you can see, it answered both the 1st and the 2nd question correctly. But for the 3rd question it was querying with a wrong question altogether. It was querying with this:
(screenshot of the logged queries omitted)
And I don't know why it is spamming the queries as if they are in some kind of loop.
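For what it's worth, one plausible reading of the repeated queries: with the default refine response mode, the query engine makes a separate LLM call for every retrieved chunk, re-sending the refine prompt each time, which in the stdout logging above can look like a loop of near-identical queries. A minimal sketch of settings that reduce those calls, assuming the index and templates from the code above (`response_mode` and `similarity_top_k` are standard `as_query_engine` parameters):

```python
# Sketch: fewer, larger LLM calls instead of one refine call per retrieved chunk.
# "compact" stuffs as many retrieved chunks as fit into each prompt;
# similarity_top_k=3 caps how many chunks are retrieved in the first place.
query_engine = index.as_query_engine(
    response_mode="compact",
    similarity_top_k=3,
    text_qa_template=text_qa_template,
    refine_template=refine_template,
)
```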
I know there are a lot of flaws in this code, but please understand I am new to AI/ML, so please kindly help!
I think the project is near completion; just a few adjustments are needed and I am confused about what I should do. Your help would be much appreciated.