Releases: FOI-Bioinformatics/flextaxd
FlexTaxD v0.3.2
Bug fixes and one additional feature
Requires biopython
- The tree structure from a node can now be visualized: use --visualize_tree to print the tree in Newick format to the terminal window
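To illustrate what Newick output for a subtree looks like, here is a minimal sketch that renders parent/child pairs as a Newick string. The function name `to_newick` and the example edges are hypothetical and are not taken from the FlexTaxD code base; the tool's own output may differ in detail (for instance in how labels are quoted).

```python
from collections import defaultdict


def to_newick(edges, root):
    """Return a Newick string for the subtree rooted at `root`.

    `edges` is an iterable of (parent, child) pairs. Labels are written
    as-is, so names containing Newick punctuation would need quoting.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    def render(node):
        kids = children.get(node, [])
        if not kids:
            # Leaf: just the label
            return str(node)
        # Internal node: children in parentheses, followed by the label
        return "(" + ",".join(render(k) for k in kids) + ")" + str(node)

    return render(root) + ";"


# Hypothetical example tree
edges = [
    ("Bacteria", "Proteobacteria"),
    ("Bacteria", "Firmicutes"),
    ("Proteobacteria", "Francisella"),
]
print(to_newick(edges, "Bacteria"))
# prints ((Francisella)Proteobacteria,Firmicutes)Bacteria;
```

The recursive form is fine for a sketch; a very deep tree would call for an iterative traversal instead.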
Bug fixes and enhancements
- Import error in v0.3.1 fixed
- Updated the Ganon database build to deal with a duplicated header error
- Increased the speed of building the Ganon database (requires more disk space during the build). An option to force the use of zipped files will be added later.
FlexTaxD v0.3.1
- New algorithm for cleaning non-annotated nodes from the database
- Incorrectly downloaded files are now caught; a warning is printed together with the genome ID
- Additional info messages added
FlexTaxD v0.3.0
FlexTaxDatabase was implemented to simplify merging and customization of different database sources. For example, the NCBI database can be used as a source while benefiting from the restructured kingdoms of Bacteria and/or Archaea from GTDB.
- Added dependency: ncbi-genome-download is required to download additional genomes not found in the input path
- Download and ProcessDirectory functions rewritten and placed in separate classes (making them independent of BuildKrakenDB)
- Fixed a bug that limited naming in QIIME formatted input databases (any *_ prefix was removed); now only GB_ and RS_ are removed (the leading characters in the GTDB default source file)
- Ganon build function restored
- Wiki added with guidelines and examples
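The prefix handling described in the list above (removing only GB_ and RS_ rather than any leading *_ pattern) can be sketched as follows. This is an illustration under assumptions, not the FlexTaxD implementation: the function name `strip_source_prefix` and the accession-like identifiers are hypothetical.

```python
# Prefixes that lead genome identifiers in the GTDB default source file
GTDB_PREFIXES = ("GB_", "RS_")


def strip_source_prefix(genome_id):
    """Remove a leading GB_ or RS_ prefix; leave every other name untouched."""
    for prefix in GTDB_PREFIXES:
        if genome_id.startswith(prefix):
            return genome_id[len(prefix):]
    return genome_id


# Hypothetical identifiers, for illustration only
print(strip_source_prefix("RS_GCF_000000001.1"))  # prints GCF_000000001.1
print(strip_source_prefix("GB_GCA_000000002.1"))  # prints GCA_000000002.1
print(strip_source_prefix("my_custom_genome"))    # prints my_custom_genome
```

Matching against an explicit prefix list keeps user-defined names that happen to contain an underscore intact, which is the naming limitation the fix addresses.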
FlexTaxD v0.2.4
FlexTaxDatabase was implemented to simplify merging and customization of different database sources. For example, the NCBI database can be used as a source while benefiting from the restructured kingdom of Bacteria from GTDB.
- Bug fixes: several bugs related to merging of databases addressed
- QIIME imports
- All Superkingdoms added as default nodes (NCBI structure)
- Option to add one extra level (pre-defined as x__) to QIIME formatted databases, to allow for example strains to be defined
- Validation function implemented: it runs automatically when merging databases, and running it once the database is complete is recommended
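As an illustration of the extra x__ level mentioned in the list above, the sketch below splits a QIIME style lineage string into parent/child pairs and treats x__ like any other rank prefix. The function name `lineage_to_edges`, the `root` label and the example lineage are assumptions made for this example; FlexTaxD's own parser may work differently.

```python
def lineage_to_edges(lineage, root="root"):
    """Split a QIIME style lineage into (parent, child, rank) tuples.

    Each field is expected to look like "<rank>__<name>", e.g. "g__Francisella".
    Fields without a name (such as a bare "s__") are skipped.
    """
    edges = []
    parent = root
    for field in lineage.split(";"):
        rank, _, name = field.strip().partition("__")
        if not name:
            # Empty or unannotated level: nothing to add
            continue
        edges.append((parent, name, rank))
        parent = name
    return edges


# Hypothetical lineage with an extra strain level defined as x__
lineage = "d__Bacteria;g__Francisella;s__Francisella tularensis;x__strain A"
for edge in lineage_to_edges(lineage):
    print(edge)
```

Because the rank is read from the prefix rather than from the field position, the additional level needs no special handling in this sketch.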
FlexTaxD v0.2.1
FlexTaxDatabase was implemented to simplify merging and customization of different database sources. For example, the NCBI database can be used as a source while benefiting from the restructured kingdom of Bacteria from GTDB.
FlexTaxD v0.2.0
FlexTaxD is built to handle import and modification of different taxonomy sources
v0.1.2 and later (NCBI, QIIME(GTDB), CanSNPer)
The custom_taxonomy_databases script contains a parser and a database handler that allow customization of databases
from NCBI, QIIME or CanSNPer sources, and supports export into NCBI formatted names.dmp and nodes.dmp files
as well as a slimmed tab separated file. Databases can be merged at selected
nodes (taxonomy IDs), and resolution can be added to certain subgroups (i.e. using a tab separated file).
The script was initially written to allow the use of GTDB with some custom modifications that allow separation of
subgroups. GTDB was created by an Australian group with the aim of restructuring the taxonomic relations of the NCBI
taxonomy tree to strictly follow a phylogenetic structure (http://gtdb.ecogenomic.org/). This script can use the
taxonomy.tsv files from the GTDB downloads page as input (with --taxonomy_type set to QIIME). By default
the script reads a tab separated file containing parent and child (defined by column headers).
All data is kept in a sqlite3 database (.ctdb by default) and can be dumped at will to NCBI formatted names.dmp
and nodes.dmp files. The supported export formats in version 0.2b are NCBI and TSV. The TSV dump format is similar to
the NCBI dump except that it contains a header (parent/child), has the parent on the left and uses only tabs to separate
columns (not |).
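The difference between the two dump formats can be sketched with an in-memory sqlite3 database. Everything about the schema here is an assumption made for illustration: the table name `tree`, its columns and the exact header text are hypothetical, and the NCBI style output is simplified to the first three nodes.dmp columns (tax ID, parent tax ID, rank).

```python
import sqlite3


def dump_tsv(conn):
    """TSV style: header line, parent on the left, tab as the only separator."""
    lines = ["parent\tchild"]
    query = "SELECT parent, child FROM tree ORDER BY child"
    for parent, child in conn.execute(query):
        lines.append(f"{parent}\t{child}")
    return "\n".join(lines) + "\n"


def dump_ncbi_nodes(conn):
    """NCBI style: no header, child (tax ID) first, fields separated by tab-pipe-tab."""
    query = "SELECT parent, child, node_rank FROM tree ORDER BY child"
    return "".join(
        f"{child}\t|\t{parent}\t|\t{node_rank}\t|\n"
        for parent, child, node_rank in conn.execute(query)
    )


# Hypothetical schema and rows, for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (parent INTEGER, child INTEGER, node_rank TEXT)")
conn.executemany(
    "INSERT INTO tree VALUES (?, ?, ?)",
    [(1, 1, "no rank"), (1, 2, "superkingdom"), (2, 1224, "phylum")],
)

print(dump_tsv(conn))
print(dump_ncbi_nodes(conn))
```

A real nodes.dmp file carries further columns after the rank, and a full NCBI export also needs a matching names.dmp; both are left out here to keep the column order contrast visible.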
FlexTaxD v0.1.2
FlexTaxD is built to handle import and modification of different taxonomy sources
v0.1.2 (NCBI, QIIME(GTDB), CanSNPer)
The custom_taxonomy_databases script contains a parser and a database handler that allow customization of databases
from NCBI, QIIME or CanSNPer sources, and supports export into NCBI formatted names.dmp and nodes.dmp files
as well as a slimmed tab separated file. Databases can be merged at selected
nodes (taxonomy IDs), and resolution can be added to certain subgroups (i.e. using a tab separated file).
The script was initially written to allow the use of GTDB with some custom modifications that allow separation of
subgroups. GTDB was created by an Australian group with the aim of restructuring the taxonomic relations of the NCBI
taxonomy tree to strictly follow a phylogenetic structure (http://gtdb.ecogenomic.org/). This script can use the
taxonomy.tsv files from the GTDB downloads page as input (with --taxonomy_type set to QIIME). By default
the script reads a tab separated file containing parent and child (defined by column headers).
All data is kept in a sqlite3 database (.ctdb by default) and can be dumped at will to NCBI formatted names.dmp
and nodes.dmp files. The supported export formats in version 0.2b are NCBI and TSV. The TSV dump format is similar to
the NCBI dump except that it contains a header (parent/child), has the parent on the left and uses only tabs to separate
columns (not |).