This repository provides easy scripts to run PrimeQA applications via docker.
We use docker and docker-compose to run our application. Make sure you have the most up-to-date version of those tools.
OS: Ubuntu 20.04.4 LTS
Memory: 32GB (64GB - Recommended)
GPU: NVIDIA Corporation GV100GL [V100 PCIe 16GB]
NVIDIA Driver version: 470.141.03
Disk space: 50 GB is required for the docker, 25 GB of available free space is needed in the docker container storage
PrimeQA services now add support for:
- Rerankers
- Generative Readers (`GenerativeReader` and `PromptReader`)
PrimeQA services now add support for BM25 and DPR Retrievers.
The `information.json` file in the index directory must include an `engine_type` field set to one of `BM25`, `ColBERT` or `DPR`.
If you have existing ColBERT indexes in `primeqa-store/indexes`, please update the `information.json` file in the index directory to include a configuration section as follows:

    "configuration": {
        "engine_type": "ColBERT",
        "checkpoint": "<checkpoint-dir-name>"
    }
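As a quick sanity check before launching, the `engine_type` requirement can be verified from the command line. The following is a minimal sketch, not part of the PrimeQA scripts: the function name `has_valid_engine_type` is hypothetical, and it uses a plain-text `grep` rather than a real JSON parser, so it only confirms that a supported value appears next to the `engine_type` key.

```shell
# Sketch: check that an index's information.json declares a supported engine_type.
# This is a text match, not a JSON parse, so it accepts the key at any nesting level.
has_valid_engine_type() {
  # $1: path to information.json
  grep -Eq '"engine_type"[[:space:]]*:[[:space:]]*"(BM25|ColBERT|DPR)"' "$1"
}

# Usage (<index-dir> is a placeholder for your index directory):
#   has_valid_engine_type primeqa-store/indexes/<index-dir>/information.json \
#     || echo "engine_type missing or unsupported"
```

An unquoted value such as `"engine_type": ColBERT` is rejected by this check, which is intended, because it is not valid JSON.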
- Set the environment variable `PUBLIC_IP` to the IP address of the host machine. This host must be reachable from the machine where you will open the browser. Otherwise, please use VNC to access the host. If you access the application from a browser on the same machine, `PUBLIC_IP` can be set to `localhost`.

  ```
  export PUBLIC_IP=<hostname>
  ```
- Please ensure that the following three ports are free and available: `50051`, `50059` and `82`.
- Launch the containers using bash in `cpu` (default) or `gpu` mode:
  - CPU mode (default): `launch.sh`
  - GPU mode: `launch.sh -m gpu`

  🚨 Note: This process will take a while to complete as it will download the necessary docker images and bring up the services.
- Run `docker ps` to verify that all three containers (primeqa-ui, primeqa-orchestrator and primeqa-service) are running.
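The `docker ps` check in the last step can be scripted. This is a small sketch under stated assumptions: the helper name `missing_containers` is hypothetical, and the container names passed to it are the ones listed in this README, so adjust them if `docker ps` shows different names on your machine. The helper reads running container names from standard input and prints any expected name that is absent.

```shell
# Sketch: report which expected containers are absent from a list of running names.
# Reads one container name per line on stdin; expected names are the arguments.
missing_containers() {
  running=$(cat)
  for name in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string
    printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# Usage (prints nothing when every container is up):
#   docker ps --format '{{.Names}}' \
#     | missing_containers primeqa-ui primeqa-orchestrator primeqa-service
```

Keeping `docker ps` outside the function makes the helper easy to try with any list of names.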
You will need to configure a few additional settings before first use. These settings are intentionally left blank for security purposes.
- Settings are defined in the file `orchestrator-store/primeqa.json`. Create this file and copy-paste the Reader and Retriever settings that you would like to use from the examples below.

  a. To use the IBM® Watson Discovery retriever and PrimeQA reader, first configure an IBM® Watson Discovery Cloud instance using the instructions here and create a collection index.

      {
        "retrievers": {
          "Watson Discovery": {
            "service_endpoint": "<IBM® Watson Discovery Cloud/CP4D Instance Endpoint>",
            "service_api_key": "<API key (ONLY If using IBM® Watson Discovery Cloud instance)>",
            "service_project_id": "<IBM® Watson Discovery Project ID>"
          }
        },
        "readers": {
          "PrimeQA": {
            "service_endpoint": "primeqa:50051",
            "beta": 0.7
          }
        }
      }

  b. To use the PrimeQA retriever and PrimeQA reader, first set up the collection index for the Retriever using the instructions here.

      {
        "retrievers": {
          "PrimeQA": {
            "service_endpoint": "primeqa:50051"
          }
        },
        "readers": {
          "PrimeQA": {
            "service_endpoint": "primeqa:50051",
            "beta": 0.7
          }
        }
      }
NOTE: The final scoring and ranking is done with a weighted sum of the Reader answer scores and the Retriever search hit scores. The `beta` field is the weight assigned to the reader scores and `1-beta` is the weight assigned to the retriever scores.
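Written out, the weighted sum in the note looks as follows. The symbols $s_{\text{reader}}$, $s_{\text{retriever}}$ and $s_{\text{final}}$ are introduced here for illustration, the example scores are made up, and any normalization the orchestrator may apply before combining the scores is not described in this README.

```latex
\[
s_{\text{final}} = \beta \, s_{\text{reader}} + (1 - \beta) \, s_{\text{retriever}}
\]
\[
\beta = 0.7,\quad s_{\text{reader}} = 0.9,\quad s_{\text{retriever}} = 0.5
\;\Rightarrow\;
s_{\text{final}} = 0.7 \times 0.9 + 0.3 \times 0.5 = 0.78
\]
```

With the default `beta` of 0.7 from the examples, the reader score therefore contributes a little more than twice the weight of the retriever score.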
Please allow 30 seconds for the primeqa-orchestrator to establish connectivity to IBM® Watson Discovery and PrimeQA service.
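Instead of waiting a fixed 30 seconds, you can poll until the orchestrator responds. The following is a minimal sketch: the `retry` helper is hypothetical and not part of the PrimeQA scripts, and the attempt count and delay in the usage line are illustrative values chosen to cover roughly the same window.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# $1: maximum attempts, $2: seconds to sleep between attempts, rest: command to run
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Stop as soon as the command succeeds
    "$@" && return 0
    # Sleep only if another attempt will follow
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the /retrievers endpoint, 6 attempts spaced 5 seconds apart.
#   retry 6 5 curl -sf -o /dev/null "http://${PUBLIC_IP}:50059/retrievers"
```

The `-f` flag makes `curl` return a non-zero exit code on HTTP errors, so the loop keeps waiting until the endpoint answers successfully.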
- You can test the PrimeQA orchestrator's connectivity to your IBM® Watson Discovery (WD) instance by executing the [GET] `/retrievers/{retriever_id}/collections` endpoint.

  curl -X 'GET' "http://${PUBLIC_IP}:50059/retrievers/WatsonDiscovery/collections" -H 'accept: application/json'
- To see all available retrievers, execute the [GET] `/retrievers` endpoint.

  curl -X 'GET' "http://${PUBLIC_IP}:50059/retrievers" -H 'accept: application/json'
- To run a sample question answering query, execute the [POST] `/ask` endpoint.

  a. Using the IBM® Watson Discovery Retriever (you must provide your `<collection_id>`)

  curl -X 'POST' "http://${PUBLIC_IP}:50059/ask" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "question": "<SAMPLE QUERY>",
      "retriever": { "retriever_id": "WatsonDiscovery" },
      "collection": {
        "collection_id": "<collection_id> from collections returned by [GET]/collections API.",
        "name": "Name of corresponding collection"
      },
      "reader": { "reader_id": "ExtractiveReader" }
    }'

  b. Using the PrimeQA Retriever (you must provide your `<collection_id>`)

  curl -X 'POST' "http://${PUBLIC_IP}:50059/ask" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "question": "<SAMPLE QUERY>",
      "retriever": { "retriever_id": "ColBERTRetriever" },
      "collection": {
        "collection_id": "<collection_id> from collections returned by [GET]/collections API.",
        "name": "Name of corresponding collection"
      },
      "reader": { "reader_id": "ExtractiveReader" }
    }'
- To run reading:

  curl -X 'POST' \
    "http://${PUBLIC_IP}:50059/GetAnswersRequest" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "question": "Where was Genghis Khan buried?",
      "contexts": [
        "Before Genghis Khan died, he assigned Ögedei Khan as his successor and split his empire into khanates among his sons and grandsons. He died in 1227 after defeating the Western Xia. He was buried in an unmarked grave somewhere in Mongolia at an unknown location. His descendants extended the Mongol Empire across most of Eurasia by conquering or creating vassal states out of all of modern-day China, Korea, the Caucasus, Central Asia, and substantial portions of modern Eastern Europe, Russia, and Southwest Asia. Many of these invasions repeated the earlier large-scale slaughters of local populations. As a result, Genghis Khan and his empire have a fearsome reputation in local histories."
      ],
      "reader": {
        "reader_id": "ExtractiveReader",
        "parameters": [
          { "parameter_id": "max_num_answers", "value": 5 }
        ]
      }
    }'
Example Answer:
[ { "text": "Mongolia at an unknown location", "confidence_score": 1, "start_char_offset": 229, "end_char_offset": 260, "context_index": 0 } ]
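The two `/ask` requests in this section differ only in the retriever and collection they name, so the request body can be generated from a few arguments. This is a sketch only: `ask_payload` is a hypothetical helper, it mirrors the fields shown in the examples of this README and nothing more, and it inserts values verbatim without JSON escaping.

```shell
# Sketch: print the JSON body for POST /ask.
# $1: question, $2: retriever_id, $3: collection_id, $4: collection name, $5: reader_id
# Values are inserted verbatim, so they must not contain double quotes or backslashes.
ask_payload() {
  printf '{ "question": "%s", "retriever": { "retriever_id": "%s" }, "collection": { "collection_id": "%s", "name": "%s" }, "reader": { "reader_id": "%s" } }\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Usage (placeholders must be replaced with your own values):
#   curl -X 'POST' "http://${PUBLIC_IP}:50059/ask" \
#     -H 'accept: application/json' \
#     -H 'Content-Type: application/json' \
#     -d "$(ask_payload '<SAMPLE QUERY>' ColBERTRetriever '<collection_id>' '<collection name>' ExtractiveReader)"
```

For questions that may contain quotes, a JSON-aware tool is a safer way to build the body than `printf`.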
You can now open a browser of your choice (Mozilla Firefox/Google Chrome) and visit "http://{PUBLIC_IP}:82" to interact with the PrimeQA application. You will see our Retrieval, Reader and QuestionAnswering components. Some features include the ability to adjust settings and for users to provide feedback on retrieved answers.
Users can provide feedback via the 👍 and 👎 icons to the answers shown in the results page.
To use the feedback to fine-tune your Reader model:
- Get the feedback data:
curl -X 'GET' \
'http://localhost:50059/feedbacks?application=reading&application=qa&_format=primeqa' \
-H 'accept: application/json' > feedbacks.json
- Follow the instructions on how to fine-tune a PrimeQA reader with custom data here. Generally, fine-tuning starts from the model used when collecting the feedback data, as specified in the `Model` field under `Reader` settings in the `Reading` and/or `QuestionAnswering` UI.
- To deploy the fine-tuned model, follow the instructions here.
a. If the UI is not loading properly or a field is blank, please try these quick steps:
- clear the browser cache and retry
- restart the containers by running `terminate.sh` and then `launch.sh`
b. To view the logs, use the docker logs command, for example:
```
docker logs primeqa-ui
docker logs primeqa-orchestrator
docker logs primeqa-services
```
- How do I switch to a different PrimeQA Reader model from the Hugging Face model hub?

  Paste the model name from the Hugging Face model hub into the `Model` field under `Reader` settings in the `Reading` and/or `QuestionAnswering` UI.

  IMPORTANT: Only models trained using PrimeQA are supported. Other Hugging Face QA models will not work.
- How do I use my custom model for the reader in the `Reading` or `QA` application?

  By default the reader initializes the `PrimeQA/nq_tydi_sq1-reader-xlmr_large-20221110` model from the Hugging Face model hub.

  To use your own reader model, place your model in a directory under the `primeqa-store/models` directory. To point to your model from the UI, navigate to `Application Settings`, scroll down to `Reader Settings` and then to `Model`, and set it to `/store/model/<model-dir>`, replacing `<model-dir>` with the name of the directory containing the model files.

  The service will load the model and initialize a new reader. This may take a few minutes. Subsequent queries will use this model.
- How do I use my ColBERT index and checkpoint?

  Please follow the instructions here.
- The Corpus field is blank in the 'Retriever' or 'Question Answering' page

  See Troubleshooting.