This project contains two sets of code:

First, scrape_accession_rules.py, a scraper for the accession format described at https://www.ncbi.nlm.nih.gov/Sequin/acc.html, the RefSeq sections described in https://www.ncbi.nlm.nih.gov/books/NBK21091/table/ch18.T.refseq_accession_numbers_and_mole/?report=objectonly and the SRA accessions described in https://www.ncbi.nlm.nih.gov/books/NBK56913/#search.what_do_the_different_sra_accessi. It can be run as

scrape_accession_rules.py accession_rules.json

This scrapes the rules from those pages and saves them in JSON format. Second, parse_accession.py reads the file generated by the scraper. It can be used as a Python module or as a command-line tool, e.g. parse_accession.py accession_rules.json AE014297.
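As a rough illustration of what parsing an accession involves, the sketch below splits an accession such as AE014297 into its letter prefix, its digits, and an optional version suffix. It is not taken from parse_accession.py: the function name split_accession and the regular expression are assumptions made for this example, and the pattern is deliberately simplified, so it will not cover every format listed on the NCBI pages.

```python
import re

# Hypothetical helper for illustration only; not part of parse_accession.py.
# Assumed simplified shape: letters, an optional underscore (RefSeq style),
# digits, and an optional ".version" suffix.
_ACCESSION_RE = re.compile(r"^([A-Z]+_?)(\d+)(?:\.(\d+))?$")


def split_accession(accession):
    """Return (prefix, digits, version) or None if the shape is not recognised."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    prefix, digits, version = match.groups()
    return prefix, digits, version


if __name__ == "__main__":
    for example in ("AE014297", "NM_000546.6", "SRR1234567"):
        print(example, split_accession(example))
```

The prefix and the number of digits are the kind of information a rules file could then be consulted with, for example to distinguish a two-letter, six-digit nucleotide accession from a RefSeq or SRA accession.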

The parse_accession.py script aims to work with all Python versions. The scrape_accession_rules.py script requires at least Python 3.6 and the modules specified in conda_requirements.txt.

This code is licensed under the BSD 3-Clause License. Please feel free to use and modify the code, but note that it is released without warranty. See the .py files for details.
