PubTator and its 2.0 version (PubTator Central) use text mining to tag PubMed abstracts/articles with standardized concepts. This repository retrieves and processes PubTator annotations for use in greenelab/snorkeling and elsewhere.
If you arrived at this page to convert PubTator annotations into BioCXML format, you no longer need to. PubTator Central now provides its own BioCXML files, which can be found here.
- Install conda.
- Create the pubtator environment by running:
conda create --name pubtator python=3.8
- Activate the environment with:
conda activate pubtator
- With the environment active, install packages via pip by running the following:
pip install -r requirements.txt
- Make sure you have Python version 3.8 installed.
- Install packages by running the following:
pip install -r requirements.txt
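Before installing, it can help to confirm which interpreter is active. The following is a minimal sketch, not part of this repository: the helper name is_supported and the choice to require an exact 3.8 major/minor match are assumptions based on the version named in the steps above.

```python
import sys

# Version named in the installation steps; treated here as an exact requirement
# (an assumption: the repository may also work on other versions).
REQUIRED = (3, 8)


def is_supported(version_info=None, required=REQUIRED):
    """Return True if the major/minor version matches the required one."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


def describe(version_info=None):
    """Return a short human-readable status line for the given interpreter version."""
    if version_info is None:
        version_info = sys.version_info
    found = ".".join(str(part) for part in version_info[:2])
    wanted = ".".join(str(part) for part in REQUIRED)
    status = "OK" if is_supported(version_info) else "mismatch"
    return f"Python {found} (expected {wanted}): {status}"
```

Calling print(describe()) inside the environment reports whether the active interpreter matches.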
To start processing PubTator or PubTator Central, run the following command from the repository root:
python execute.py --config config_files/pubtator_central_config.json
If the original PubTator is desired, replace pubtator_central_config.json with pubtator_config.json. The JSON file contains all the parameters needed to run. More information about the JSON file can be found here.
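Because the whole run is driven by one JSON file, a quick check that the file exists and parses can catch mistakes before a long job starts. The sketch below is illustrative only: load_config and build_command are hypothetical helpers, not functions from this repository, and nothing is assumed about the keys inside the config beyond it being a JSON object.

```python
import json
import sys
from pathlib import Path


def load_config(config_path):
    """Read a JSON config file and return it as a dict, failing early on bad input."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Config file not found: {path}")
    with path.open(encoding="utf-8") as handle:
        config = json.load(handle)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(config, dict):
        # Assumption: the config is a JSON object at the top level.
        raise ValueError(f"Expected a JSON object at the top level of {path}")
    return config


def build_command(config_path):
    """Build the execute.py invocation shown above, using the current interpreter."""
    return [sys.executable, "execute.py", "--config", str(config_path)]
```

One possible use is to call load_config("config_files/pubtator_central_config.json"), print sorted(config) to review the top-level parameter names, and then pass build_command(...) to subprocess.run(..., check=True) from the repository root.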
This repository is dual licensed as BSD 3-Clause and CC0 1.0, meaning any repository content can be used under either license. This licensing arrangement ensures source code is available under an OSI-approved License, while non-code content — such as figures, data, and documentation — is maximally reusable under a public domain dedication.