FoQA

Faroese question-answering dataset, generated by GPT-4.


Developer(s):

Quickstart

  1. Run make install, which sets up a virtual environment and all Python dependencies therein.
  2. Run source .venv/bin/activate to activate the virtual environment.
  3. Run echo "OPENAI_API_KEY=<your-openai-api-key>" > .env to enable OpenAI generation.
  4. Run python src/scripts/create_dataset.py to create the dataset.

The raw dataset is stored in data/raw and is updated continuously during creation. The final dataset is written to data/final.
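
Because step 3 is easy to get wrong (a missing quote or a leftover placeholder), it may help to check the .env file before starting generation. The following is a minimal sketch using only the standard library; the helper names read_env_file and has_openai_key are hypothetical and not part of this repository, and the project itself may load .env differently.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and comments.

    Hypothetical helper for illustration; not part of the FoQA codebase.
    """
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(path: str = ".env") -> bool:
    """Return True if OPENAI_API_KEY is set and is not the <...> placeholder."""
    key = read_env_file(path).get("OPENAI_API_KEY", "")
    return bool(key) and not key.startswith("<")


if __name__ == "__main__":
    if has_openai_key():
        print("OPENAI_API_KEY found in .env")
    else:
        print("OPENAI_API_KEY is missing or still a placeholder in .env")
```

Running this from the repository root before step 4 only confirms that the key is present; it does not verify that the key is valid.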

Docker

Alternatively, you can build and run the Docker image defined in the Dockerfile, which creates the dataset without requiring you to set up a Python environment.