This is a Python CLI toolkit to quickly create a chatbot with a web-based user interface. It has the following features:
- Automatic chunking and embedding (with RobBERT) for document retrieval.
- Scraping configurable URLs for knowledge.
- 8-bit inference for generation at >15 tokens/sec.
- Configurable prompts with sensible defaults.
- High-quality generations with low VRAM usage thanks to Mistral-7B.
- Supports various multilingual (e.g. Mistral-7B) and Dutch models (e.g. GEITje-7b-chat).
- Built-in model-dependent templates for conversation.
- Web-UI with Gradio.
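The chunking step mentioned above can be sketched as follows. This is a minimal illustration, assuming fixed-size character chunks with overlap; the hypothetical `chunk_size` default mirrors the toolkit's `--chunk-size 1024` flag, and the actual implementation (and the RobBERT embedding step that follows it) may differ.

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split a document into fixed-size character chunks with overlap.

    Overlap keeps sentences that straddle a chunk boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each resulting chunk would then be embedded (with RobBERT in this toolkit) and stored in the vector database.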
You need a GPU with at least 10.6 GB of VRAM.
First, create and activate a virtual environment and install the required dependencies (PyTorch, Hugging Face, Gradio, LangChain, ...):
python -m venv .env
source .env/bin/activate
pip install -r requirements.txt
After that, the toolkit is ready to run, but the vector store contains no documents yet. Run the toolkit once with the following flag to scrape the URLs listed in sources.txt:
python main.py --load-from-scratch
This only needs to be done once, unless you update the list of sources. The documents are stored in a vector database at the --vectors-db location; the default folder is vectors/.
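Reading the source list could look like the sketch below. The exact format of sources.txt is an assumption here (one URL per line, with blank lines and `#` comments ignored); check the file shipped with the toolkit for the actual convention.

```python
from pathlib import Path

def read_sources(path: str = "sources.txt") -> list[str]:
    """Return the URLs listed in a sources file.

    Assumed format: one URL per line; blank lines and lines
    starting with '#' are skipped.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.strip().startswith("#")]
```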
Once the knowledge base is initialized, you can run the server at any time using the following command:
python main.py
This will open a Gradio http server on that machine, which you can access on the url that is printed in the terminal.
The demo of this chatbot toolkit is about the Belgian town of Oudenaarde. You can easily change this by passing a different topic (the demo uses --topic Oudenaarde) and updating the list of sources in sources.txt. Make sure to run with the --load-from-scratch flag once afterwards.
The --topic flag is used as part of a prompt, so make sure its value fits the following sentence:
Je bent een expert in {topic}. ("You are an expert in {topic}.")
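Concretely, the topic is interpolated into that sentence. The helper below is a hypothetical illustration, not the toolkit's actual code:

```python
# The Dutch template sentence from the README ("You are an expert in {topic}.")
SYSTEM_TEMPLATE = "Je bent een expert in {topic}."

def build_system_prompt(topic: str) -> str:
    """Interpolate the --topic value into the system prompt sentence."""
    return SYSTEM_TEMPLATE.format(topic=topic)
```

A topic like 'de Oost-Vlaamse stad Oudenaarde' reads naturally in that slot; a bare noun phrase works, while a full sentence would not.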
By default, we use Mistral-7B, but different models are also possible, for instance the Dutch GEITje model:
python main.py --model-name Rijgersberg/GEITje-7B-chat-v2
Note that some models require a different prompt format. GEITje is a Mistral-7B derivative, so it uses the same [INST] tokens.
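For reference, a single-turn Mistral-style instruction prompt can be sketched as below. This is a simplified assumption of the format; the toolkit's built-in templates (and a model's own chat template) also handle multi-turn history.

```python
def format_mistral_prompt(system: str, user: str) -> str:
    """Wrap a system message and user question in Mistral's
    [INST] ... [/INST] instruction tokens (single-turn sketch)."""
    return f"<s>[INST] {system}\n\n{user} [/INST]"
```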
The toolkit uses two models, RobBERT and Mistral-7B, which together require ~15 GB of free disk space. You can change the storage location by setting the Hugging Face home directory (HF_HOME) before running the toolkit:
export HF_HOME=/your/path
python main.py
The following command illustrates how to change the most important parameters:
python main.py
--load-from-scratch
--model-name 'mistralai/Mistral-7B-Instruct-v0.1'
--vectors-db vectors/
--chunk-size 1024 # Size of the chunks from the sources
--title 'OudenaardeGPT' # Shown in the web UI
--topic 'de Oost-Vlaamse stad Oudenaarde' # Prompt
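The flags above could be declared with argparse roughly as follows. The flag names come from this README; the defaults shown are illustrative assumptions, not necessarily the toolkit's actual defaults.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Parser sketch for the toolkit's CLI flags (defaults assumed)."""
    p = argparse.ArgumentParser(description="Chatbot toolkit")
    p.add_argument("--load-from-scratch", action="store_true",
                   help="Scrape the URLs in sources.txt and (re)build the vector store")
    p.add_argument("--model-name", default="mistralai/Mistral-7B-Instruct-v0.1")
    p.add_argument("--vectors-db", default="vectors/")
    p.add_argument("--chunk-size", type=int, default=1024,
                   help="Size of the chunks from the sources")
    p.add_argument("--title", default="OudenaardeGPT", help="Shown in the web UI")
    p.add_argument("--topic", default="de Oost-Vlaamse stad Oudenaarde")
    return p
```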
Contributions are always welcome. Just open a pull request or an issue.
In particular the following features are welcome:
- Processing the scraped websites with an LLM to clean HTML artifacts.
- Summarizing the question that a user asks and using that for more accurate retrieval instead of embedding the question.
The demo is about Oudenaarde, but try to keep this part modular so users can change it easily.