Faroese question-answering dataset, generated by GPT-4.
Developer(s):
- Dan Saattrup Nielsen ([email protected])
- Run `make install`, which sets up a virtual environment and all Python dependencies therein.
- Run `source .venv/bin/activate` to activate the virtual environment.
- Run `echo "OPENAI_API_KEY=<your-openai-api-key>" > .env` to enable OpenAI generation.
- Run `python src/scripts/create_dataset.py` to create the dataset.
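Before starting a long generation run, it can help to confirm that the `.env` file actually contains the key. The following is a minimal sketch using only the standard library; `read_env_file` and `has_openai_key` are hypothetical helpers written for illustration and are not part of this repository, and how `create_dataset.py` itself loads `.env` is not shown here.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from an env file (hypothetical helper)."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values: dict[str, str] = {}
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without a KEY=VALUE shape.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(path: str = ".env") -> bool:
    """Return True if OPENAI_API_KEY is set and is not the <...> placeholder."""
    key = read_env_file(path).get("OPENAI_API_KEY", "")
    return bool(key) and not key.startswith("<")


if __name__ == "__main__":
    if has_openai_key():
        print("OPENAI_API_KEY found in .env")
    else:
        print("OPENAI_API_KEY missing from .env")
```

The check only verifies that a non-placeholder value is present; it does not validate the key against the OpenAI API.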
The raw dataset will be stored in `data/raw` and will be updated continuously during creation, and the final dataset will appear in `data/final`.
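Since the raw output is updated continuously, one way to monitor progress is to list what has been written so far. The sketch below only reports file names and sizes, because the file format of the dataset is not specified here; `list_dataset_files` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def list_dataset_files(directory: str) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under `directory`."""
    root = Path(directory)
    if not root.is_dir():
        # The directory may not exist yet if creation has not started.
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    for folder in ("data/raw", "data/final"):
        files = list_dataset_files(folder)
        print(f"{folder}: {len(files)} file(s)")
        for name, size in files:
            print(f"  {name} ({size} bytes)")
```

Running it repeatedly while `create_dataset.py` is active should show the files under `data/raw` growing in number or size.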
You can also run the `Dockerfile` directly, which builds the dataset without having to set up a Python environment.