This module synchronizes CORD-19 papers into MongoDB and Elasticsearch and converts their metadata to various bibliographic standards (MARC21, etc.).
pip3 install -r requirements.txt
- Download the original collection from Kaggle or directly from Ai2
- Unzip the archive into a folder on your hard drive, for example /corddata
- Edit api/config.py: set "maindir" to that folder and "cordversion" to the current CORD-19 version on Kaggle (v38 at the time of writing)
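Before running the import, it can help to sanity-check the two values set in api/config.py. The snippet below is only a sketch: `check_settings` is a hypothetical helper that is not part of this module, and the exact value format of "cordversion" (written here as "v38") is an assumption.

```python
import os


def check_settings(maindir, cordversion):
    """Return a list of problems; an empty list means the values look usable.

    Hypothetical helper, not part of this module.
    """
    problems = []
    if not os.path.isdir(maindir):
        problems.append(f"maindir does not exist or is not a directory: {maindir}")
    elif not os.listdir(maindir):
        # The unzipped CORD-19 archive should have put files here.
        problems.append(f"maindir is empty: {maindir}")
    if not str(cordversion).strip():
        problems.append("cordversion is empty")
    return problems


# Example call with the values used in this README (format of "v38" is assumed):
# for problem in check_settings("/corddata", "v38"):
#     print(problem)
```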
mongo admin
db.createUser({user: "coronawhyguest" , pwd: "coro901na", roles: [ "readWriteAnyDatabase" ]});
python3 ./start.py
Log in to Mongo and check the imported CORD-19 metadata records:
mongo -u coronawhyguest -p coro901na --authenticationDatabase admin cord19
db.v38.find().count()
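The same check can be scripted from Python. This is a minimal sketch, assuming the pymongo package is installed, MongoDB listens on localhost:27017, and the user was created in the admin database as shown above. `build_mongo_uri` and `count_records` are hypothetical helper names, not part of this module.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host="localhost", port=27017, db="cord19"):
    """Build a MongoDB URI; host and port defaults are assumptions."""
    # Percent-encode the credentials so characters such as '@' or ':' are safe.
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db}?authSource=admin"
    )


def count_records(uri, collection="v38"):
    """Count all documents in the given collection of the URI's database."""
    from pymongo import MongoClient  # imported lazily; requires pymongo

    client = MongoClient(uri)
    try:
        return client.get_default_database()[collection].count_documents({})
    finally:
        client.close()


# Example call (needs a running MongoDB with the imported data):
# print(count_records(build_mongo_uri("coronawhyguest", "coro901na")))
```

The collection name follows the CORD-19 version, so "v38" should be changed if a different version was imported.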