- Download DisTEMIST data (v5.0) from Zenodo (https://zenodo.org/record/6532684) and unzip the content into data/distemist
- Download trained models and dictionaries from Zenodo (https://zenodo.org/record/6642064) and extract them into dicts and models
- Install Python dependencies:
pip install -r requirements.txt
python -m spacy download es_core_news_md
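
Before training, it can help to confirm that the downloads ended up where the steps above put them. The following is a minimal sketch, assuming the three target directories are data/distemist, dicts and models relative to the repository root; the helper name `missing_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Directories named in the setup steps, relative to the repository root.
# Assumption: the extracted archives live directly in these folders.
EXPECTED_DIRS = ("data/distemist", "dicts", "models")


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run it from the repository root. It only checks that the folders exist, not that their contents are complete.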
The Hydra config file ner_config.yaml contains all hyperparameters needed to reproduce the NER training results.
To train the model on CUDA device n, run:
CUDA_VISIBLE_DEVICES=<n> python scripts/run_ner_training.py
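
The same launch can be scripted from Python, which is convenient when several runs are queued on different devices. This is a sketch only: `training_command` and `launch` are hypothetical helpers, and the sketch assumes the training script is started from the repository root with no further arguments.

```python
import os
import subprocess
import sys


def training_command(device):
    """Build the command and environment for one training run."""
    # Restrict the child process to a single GPU, as in the shell command.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(device))
    cmd = [sys.executable, "scripts/run_ner_training.py"]
    return cmd, env


def launch(device, dry_run=True):
    """Print the run, and start it unless dry_run is set."""
    cmd, env = training_command(device)
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    launch(device=0)  # dry run by default; pass dry_run=False to train
```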
We performed a hyperparameter grid search over the parameters listed in ner_hyperparamter_sweep.sh.
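
To illustrate what a grid search over such parameters amounts to, the sketch below expands a grid into one command line per combination, using Hydra's `key=value` command-line override syntax. The parameter names and values (`learning_rate`, `batch_size`) are placeholders chosen for illustration; the actual grid is the one defined in the sweep script, and the sketch assumes the training script accepts overrides for the keys in its config.

```python
from itertools import product

# Hypothetical grid: the real parameter names and values are those
# listed in the sweep script, not these.
GRID = {
    "learning_rate": [1e-5, 5e-5],
    "batch_size": [16, 32],
}


def grid_overrides(grid):
    """Yield one list of Hydra-style key=value overrides per grid point."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield [f"{k}={v}" for k, v in zip(keys, values)]


if __name__ == "__main__":
    # Print the command for each combination instead of running it.
    for overrides in grid_overrides(GRID):
        print("python scripts/run_ner_training.py", " ".join(overrides))
```

With two values for each of two parameters, this prints four commands, one per combination.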