FlexTaxD is a versatile tool for the customization and integration of taxonomic classifications from diverse sources. It facilitates the creation, modification, and export of taxonomy databases for bioinformatics applications.
- Supported Taxonomy Formats: NCBI, TSV, MP-style (GTDB, QIIME, SILVA, CanSNPer).
- Database Build Programs: Compatible with kraken2, ganon, krakenuniq, and centrifuge.
- Database Customization: Modify, annotate, and clean up taxonomy databases to fit specific research needs.
- Output: Export databases to NCBI-formatted files or tab-separated values, with customizable options.
- Data Management: Stores and manages the taxonomy data in a SQLite database.
For a complete walkthrough, refer to the FlexTaxD Wiki.
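Because the taxonomy data lives in a SQLite file, it can be inspected with any SQLite client. The following is a minimal sketch using Python's standard `sqlite3` module. It assumes only that the database file is a regular SQLite file. It makes no assumption about FlexTaxD's table names and instead reads them from the file itself. The helper name `list_tables` is hypothetical and is not part of FlexTaxD.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file.

    Hypothetical helper for illustration; not part of FlexTaxD.
    """
    # Open the file read-only so that inspecting it can never modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from the file itself, so quote them defensively.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

Calling `list_tables` with the path to a database file returns a dictionary of whatever tables that file contains, which is a quick way to check that a build produced data before exporting it.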
# With conda using mamba
conda install mamba -n base -c conda-forge
mamba create -c conda-forge -c bioconda -n flextaxd flextaxd
# Manual Python installation
git clone https://github.com/FOI-Bioinformatics/flextaxd
cd flextaxd
pip install .
# Create a custom taxonomy database
flextaxd --taxonomy_file taxonomy.tsv --database .ftd
# Export database to NCBI format
flextaxd --dump
# Print database statistics
flextaxd --stats
# Get help
flextaxd --help
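The commands above can also be driven from a script. The sketch below is one possible way to do that with Python's standard `subprocess` module. The command lines are copied unchanged from the examples above. The names `STEPS` and `run_steps` and the `dry_run` option are hypothetical and are not part of FlexTaxD.

```python
import shutil
import subprocess

# Command lines copied unchanged from the examples above.
STEPS = [
    ["flextaxd", "--taxonomy_file", "taxonomy.tsv", "--database", ".ftd"],
    ["flextaxd", "--dump"],
    ["flextaxd", "--stats"],
]


def run_steps(steps=STEPS, dry_run=False):
    """Run each command in order and stop at the first failure.

    Returns the list of commands that were run (or, with dry_run=True,
    the commands that would have been run).
    """
    if not dry_run and shutil.which(steps[0][0]) is None:
        raise FileNotFoundError("flextaxd is not on PATH; install it first")
    done = []
    for cmd in steps:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, check=True)
        done.append(cmd)
    return done
```

Calling `run_steps(dry_run=True)` returns the planned commands without executing anything, which is useful for checking the sequence before running it against real data.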
- Python >=3.6
- Additional dependencies vary based on executed functions (e.g., ncbi-genome-download for genome downloads).
- biopython: Required for Newick visualizations.
- matplotlib: For tree visualizations.
- inquirer: For interactive prompts when multiple parents are present.
- kraken2, krakenuniq, ganon, centrifuge: Required if create_database is used.
Your contributions are welcome! Please refer to the Contribution Guide for details on how to submit pull requests, report issues, or request features.
FlexTaxD is open-sourced under the MIT license.
FlexTaxD is published in Bioinformatics.
Sundell, D. et al. (2021) ‘FlexTaxD: flexible modification of taxonomy databases for improved sequence classification’, Bioinformatics. Edited by J. Kelso. doi: 10.1093/bioinformatics/btab621.