The purpose of this proof of concept is to assess what is required to create a versatile, reliable and intuitive chatbot that users can engage with on Alkemio-related topics. The project is not deployable as is, but should serve as valuable input for demonstrating generative AI capabilities and for assessing what is required to embed this functionality in the platform.
Large Language Models (LLMs) have improved significantly in recent years and are now ubiquitous and performant. This opens up many possibilities for their usage in different areas. OpenAI is the best-known commercial provider of LLMs, but there are ample choices of LLM models, both commercial and open source. Whilst this provides options, it also creates the risk of provider lock-in.
LLMs are just one component required for the practical implementation of generative AI solutions, and many other 'building blocks' are necessary too. Langchain is a popular open-source library that provides these building blocks and creates an abstraction layer, establishing provider independence.
Training an LLM is prohibitively expensive for most organisations, but most practical implementations need to incorporate organisation-specific data. A common approach is to add context that is specific to the user question to the prompt that is submitted to the LLM. This poses a challenge, as LLMs generally only allow prompts of a finite size (typically around 4k tokens). It is therefore important that the relevant contextual information is provided, and for that the following needs to be done (a sketch follows the list below):
- Data Collection
- Creating Text Embeddings
- Prompt Engineering
- Creating the Chat Interface
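As a rough illustration of how these steps fit together, the sketch below uses LangChain to embed source text and inject the most relevant chunks into a prompt. It is not this project's actual implementation; the input file, chunk sizes and prompt wording are assumptions for demonstration purposes.

```python
# Minimal context-injection sketch with LangChain (illustrative only:
# the input file, chunk sizes and prompt wording are assumptions).
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# 1. Data collection: assume the website content was cloned to a local file.
raw_text = open("data/website.txt").read()  # hypothetical path

# 2. Text embeddings: split into chunks that fit the prompt budget,
#    embed each chunk and index the vectors for similarity search.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
store = FAISS.from_texts(splitter.split_text(raw_text), OpenAIEmbeddings())

# 3. Prompt engineering: retrieve the chunks closest to the question
#    and place them in the prompt as context.
question = "What are the key Alkemio concepts?"
docs = store.similarity_search(question, k=4)
context = "\n\n".join(doc.page_content for doc in docs)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The chat interface then wraps this prompt assembly in a per-user conversation loop.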
This project has been inspired by many articles, both theoretical and practical. A significant part of the code base comes from the Building an AWS Well-Architected Chatbot with LangChain project.
The project has been implemented as a container-based micro-service with a RabbitMQ RPC. There is one RabbitMQ queue:
- `alkemio-virtual-contributor-engine-guidance`: queue for submitting requests to the micro-service
The request payload consists of JSON with the following structure (example for a query):
```json
{
  "data": {
    "userId": "userID",
    "question": "What are the key Alkemio concepts?",
    "language": "UK"
  },
  "pattern": {
    "cmd": "query"
  }
}
```
The operation types are:

- `ingest`: data collection from the Alkemio foundation website (through the GitHub source) and embedding using the OpenAI Ada text model; no additional request data.
- `reset`: reset the chat history for the ongoing chat; requires `userId`.
- `query`: post the next question in a chat sequence; see the example above.
The response is published in an auto-generated, exclusive, unnamed queue.
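For illustration, a minimal RPC-style client using the `pika` library might look as follows; the host, credentials and reply handling are assumptions and not part of this repository:

```python
# Illustrative RPC client for the queue above (a sketch: host,
# credentials and reply handling are assumptions, not repo code).
import json
import uuid

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Let RabbitMQ create the auto-generated, exclusive, unnamed reply queue.
callback_queue = channel.queue_declare(queue="", exclusive=True).method.queue

payload = {
    "data": {
        "userId": "userID",
        "question": "What are the key Alkemio concepts?",
        "language": "UK",
    },
    "pattern": {"cmd": "query"},
}
channel.basic_publish(
    exchange="",
    routing_key="alkemio-virtual-contributor-engine-guidance",
    properties=pika.BasicProperties(
        reply_to=callback_queue, correlation_id=str(uuid.uuid4())
    ),
    body=json.dumps(payload),
)

# Block until the engine publishes its answer to the reply queue.
for _method, _properties, body in channel.consume(callback_queue, auto_ack=True):
    print(json.loads(body))
    break
connection.close()
```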
There is a draft implementation for the interaction language of the model (this needs significant improvement). If no language code is specified, English will be assumed. Choices are: 'EN': 'English', 'US': 'English', 'UK': 'English', 'FR': 'French', 'DE': 'German', 'ES': 'Spanish', 'NL': 'Dutch', 'BG': 'Bulgarian', 'UA': 'Ukrainian'
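A minimal sketch of such a lookup with the English fallback (the mapping mirrors the choices above; the helper name is hypothetical):

```python
# Hypothetical helper mirroring the mapping above; unknown or
# missing codes fall back to English.
LANGUAGE_NAMES = {
    "EN": "English", "US": "English", "UK": "English",
    "FR": "French", "DE": "German", "ES": "Spanish",
    "NL": "Dutch", "BG": "Bulgarian", "UA": "Ukrainian",
}

def interaction_language(code: str | None) -> str:
    return LANGUAGE_NAMES.get((code or "EN").upper(), "English")
```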
*Note: there is an earlier (outdated) RESTful implementation available at https://github.com/alkem-io/virtual-contributor-engine-guidance/tree/http-api*
The following command can be used to build the container from the Docker CLI (the default architecture is amd64, so add `--build-arg ARCHITECTURE=arm64` for arm64 builds):
```bash
docker build --build-arg ARCHITECTURE=arm64 --no-cache -t alkemio/virtual-contributor-engine-guidance:v0.4.0 .
docker build --no-cache -t alkemio/virtual-contributor-engine-guidance:v0.2.0 .
```
The Dockerfile has some self-explanatory configuration arguments.
The following command can be used to start the container from the Docker CLI:
```bash
docker run --name virtual-contributor-engine-guidance -v /dev/shm:/dev/shm --env-file .env alkemio/virtual-contributor-engine-guidance:v0.4.0
```
where `.env` is based on `.azure-template.env`. Alternatively, use `docker-compose up -d`.

The `.env` file needs to be configured with:
- `AZURE_OPENAI_API_KEY`: a valid Azure OpenAI API key
- `OPENAI_API_VERSION`: a valid Azure OpenAI version. At the moment of writing, the latest is `2023-05-15`
- `AZURE_OPENAI_ENDPOINT`: a valid Azure OpenAI base URL, e.g. `https://{your-azure-resource-name}.openai.azure.com/`
- `RABBITMQ_HOST`: the RabbitMQ host name
- `RABBITMQ_USER`: the RabbitMQ user
- `RABBITMQ_PASSWORD`: the RabbitMQ password
- `AI_MODEL_TEMPERATURE`: the `temperature` of the model; use a value between 0 and 1, where 1 means a more randomized answer and values closer to 0 a stricter one
- `LLM_DEPLOYMENT_NAME`: the AI GPT model deployment name in Azure
- `EMBEDDINGS_DEPLOYMENT_NAME`: the AI embeddings model deployment name in Azure
- `AI_SOURCE_WEBSITE`: the URL of the foundation website that contains the source data (for references only)
- `AI_SOURCE_WEBSITE2`: the URL of the welcome website that contains the source data (for references only)
- `AI_LOCAL_PATH`: local file path for storing data
- `AI_WEBSITE_REPO`: URL of the Git repository containing the foundation website source data, based on Hugo - without https
- `AI_WEBSITE_REPO2`: URL of the Git repository containing the welcome website source data, based on Hugo - without https
- `AI_GITHUB_USER`: GitHub user used for cloning the website repositories
- `AI_GITHUB_PAT`: personal access token for cloning the website repositories
- `LANGCHAIN_TRACING_V2`: enable Langchain tracing
- `LANGCHAIN_ENDPOINT`: Langchain tracing endpoint (e.g. `https://api.smith.langchain.com`)
- `LANGCHAIN_API_KEY`: Langchain tracing API key
- `LANGCHAIN_PROJECT`: Langchain tracing project name (e.g. `virtual-contributor-engine-guidance`)
You can find sample values in `.azure-template.env`. Configure them and create a `.env` file with the updated settings.
The project requires Python & Poetry installed. The minimum version dependencies can be found in `pyproject.toml`.
After installing Python & Poetry:

- Install the dependencies: `poetry install`
- Run using: `poetry run python virtual_contributor_engine_guidance.py`
The project requires Python 3.11 as a minimum and needs Go and Hugo installed for creating a local version of the website. See the Go and Hugo documentation for installation instructions (only needed when running outside a container).
The following tasks are still outstanding:
- clean up code and add more comments.
- improve the interaction language handling.
- assess overall quality and performance of the model and make improvements as and when required.
- assess the need to summarize the chat history to avoid exceeding the prompt token limit (one possible approach is sketched after this list).
- update the yaml manifest.
- perform extensive testing, in particular in multi-user scenarios.
- look at improvements to the ingestion process. As a minimum, the service engine should not consume queries whilst ingestion is ongoing, as that will lead to errors.
- look at the possibility of implementing reinforcement learning.
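For the chat-history summarization mentioned above, one possible approach is LangChain's `ConversationSummaryBufferMemory`, which folds older turns into a running summary once the history exceeds a token budget. This is a sketch under assumed model settings, not a committed design:

```python
# Sketch only: summarize older chat history to stay within the prompt
# token limit. Model, temperature and token budget are assumptions.
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationSummaryBufferMemory

llm = ChatOpenAI(temperature=0)
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=1000)
chain = ConversationChain(llm=llm, memory=memory)

# Older turns are folded into a summary automatically as the limit nears.
chain.predict(input="What are the key Alkemio concepts?")
```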