Input data

genomic FASTA file
- nucleotide sequence of the assembly, divided into contigs/scaffolds
gene annotation as a sorted GFF3 file that follows the GFF3 format standard
proteins FASTA file (optional)
- protein sequences for all protein coding genes in the data set
  - headers must start with the ID attribute of the corresponding gene, mRNA, or CDS feature in the GFF file
  - any text after the first space or pipe character ('|') is ignored
  - a mapping file of protein to gene can be provided with the parameter 'prot2gene_mapper' (format: 'protein_header<TAB>gene_id')
- will be extracted by the taXaminer pipeline if not provided
coverage information as a sorted BAM file (optional)
- multiple BAM files can be provided
config file
- YAML format
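
The header rule for the proteins FASTA file (the ID comes first, and everything after the first space or pipe is ignored) can be illustrated with a minimal Python sketch. The function names `header_to_id` and `read_protein_ids` are hypothetical and are not part of taXaminer; this only shows how a header could be reduced to the ID that has to match the GFF file, not how the pipeline itself does it.

```python
import re


def header_to_id(header: str) -> str:
    """Return the part of a FASTA header before the first space or pipe."""
    # Accept the raw header line as well, with or without the leading '>'.
    header = header.lstrip(">")
    # Split once at the first space or '|' and keep the leading token.
    return re.split(r"[ |]", header, maxsplit=1)[0]


def read_protein_ids(fasta_lines):
    """Collect the IDs of all header lines from an iterable of FASTA lines."""
    return [
        header_to_id(line.rstrip("\n"))
        for line in fasta_lines
        if line.startswith(">")
    ]
```

Running the proteins FASTA file through a check like this before starting the pipeline makes it easy to compare the resulting IDs against the ID attributes in the GFF file.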
