Please see the Wiki page for an introduction and tutorial on how to use this tool.
Garber AI, Nealson KH, Okamoto A, McAllister SM, Chan CS, Barco RA and Merino N (2020) FeGenie: A Comprehensive Tool for the Identification of Iron Genes and Iron Gene Neighborhoods in Genome and Metagenome Assemblies. Front. Microbiol. 11:37. doi: 10.3389/fmicb.2020.00037
Special thanks to Michael Lee for helping to put together the Conda environment for FeGenie. Thanks to Natasha Pavlovikj for creating the Conda recipe for FeGenie. Thanks to Michał Sitko for creating a Dockerfile for FeGenie.
conda create -n fegenie -c conda-forge -c bioconda -c defaults fegenie=1.0 --yes
conda activate fegenie
FeGenie.py -h
When you are done using FeGenie, deactivate its Conda environment:
conda deactivate
git clone https://github.com/Arkadiy-Garber/FeGenie.git
cd FeGenie
bash setup.sh
./FeGenie.py -h
FeGenie.py -bin_dir /directory/of/bins/ -bin_ext fasta -t 16
The argument for -bin_ext needs to represent the filename extension of the FASTA files in the selected directory that you would like analyzed (e.g. fa, fasta, fna, etc).
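Before starting a run, it can help to confirm how many files the chosen extension actually matches. The following is a minimal sketch and not part of FeGenie: the helper name `count_bins` is hypothetical, and it assumes files are selected purely by their filename suffix.

```shell
# Hypothetical helper (not part of FeGenie): count the regular files in a
# directory whose names end in ".<extension>".
# The body runs in a subshell so its variables do not leak into the caller.
count_bins() (
    dir="${1%/}"   # drop a single trailing slash, if present
    ext="$2"
    n=0
    for f in "$dir"/*."$ext"; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
)

# Example call:
#   count_bins /directory/of/bins/ fasta
```

If the count is zero, the value passed to -bin_ext probably does not match the files in the directory.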
./FeGenie.py -bin_dir /directory/of/bins/ -bin_ext fasta -t 16 -out output_fegenie
The hmms/iron directory can be found within FeGenie's main repository.
-t 16 means that 16 threads will be used for HMMER and BLAST. If you have fewer than 16 threads available on your system, set this number lower (default = 1).
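One way to avoid requesting more threads than the machine has is to cap the value before passing it to -t. This is a rough sketch, not something FeGenie provides: the helper name `pick_threads` is hypothetical, and it assumes either `nproc` or `getconf _NPROCESSORS_ONLN` is available to report the number of online processors.

```shell
# Hypothetical helper (not part of FeGenie): print the requested thread
# count, capped at the number of processors reported by the system.
pick_threads() (
    requested="$1"
    # nproc is GNU coreutils; getconf is a fallback for systems without it.
    available=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
)

# Example call:
#   FeGenie.py -bin_dir /directory/of/bins/ -bin_ext fasta -t "$(pick_threads 16)"
```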
FeGenie introductory slideshow:
FeGenie video tutorial:
To start the tutorial, hit the 'launch binder' button below, and follow the commands in 'Walkthrough'
(Initially forked from here. Thank you to the awesome binder team!)
Enter the main FeGenie directory
cd FeGenie
print the FeGenie help menu
FeGenie.py -h
run FeGenie on test dataset
FeGenie.py -bin_dir genomes/ -bin_ext fna -out fegenie_out
Go into the output directory and check out the output files
cd fegenie_out
less FeGenie-geneSummary-clusters.csv
run FeGenie on gene calls
FeGenie.py -bin_dir ORFs/ -bin_ext faa -out fegenie_out --orfs
run FeGenie on gene calls, and use reference database (RefSeq sub-sample) for cross-validation
FeGenie.py -bin_dir ORFs/ -bin_ext faa -out fegenie_out --orfs -ref refseq_db/refseq_nr.sample.faa
When running FeGenie with Docker, the only dependency you need to have installed is Docker itself (installation guide). With Docker installed, you can run FeGenie as follows:
docker run -it -v $(pwd):/data --env iron_hmms=/data/hmms/iron --env rscripts=/data/rscripts note/fegenie-deps ./FeGenie.py -bin_dir /data/test_dataset -bin_ext txt -out fegenie_out -t $(nproc)
The ./FeGenie.py ... portion follows the normal, non-dockerized flow of arguments.
Note that you need to mount the directories containing the files FeGenie is supposed to read. If you are not familiar with Docker, run the docker run command from the directory into which you cloned the FeGenie repository. If all the files you pass to FeGenie are inside this directory and you use relative file paths (e.g. hmms/iron), everything will work as expected.
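Because only the current directory is mounted into the container, a quick check that an input directory really lives under it can save a failed run. The sketch below is an illustration only: the helper name `is_inside_cwd` is hypothetical, it handles directories rather than individual files, and it is not part of FeGenie or Docker.

```shell
# Hypothetical helper (not part of FeGenie or Docker): succeed only if the
# given directory exists and resolves to a location inside the current
# working directory, which is what -v $(pwd):/data makes visible.
is_inside_cwd() (
    # Resolve the target to a physical absolute path; fail if it is missing.
    target=$(cd "$1" 2>/dev/null && pwd -P) || exit 1
    here=$(pwd -P)
    here="${here%/}"   # so that running from "/" also behaves sensibly
    case "$target/" in
        "$here"/*) exit 0 ;;
        *) exit 1 ;;
    esac
)

# Example call, run from the cloned FeGenie directory:
#   is_inside_cwd hmms/iron && echo "hmms/iron will be visible under /data"
```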
- Ability to accept previously-annotated genomes and gene-calls.
- Include Cytochrome 579 (and possibly rusticyanin)
- Improve delineation between MtrA and MtoA for better resolution with respect to identification of iron reduction and iron oxidation, respectively.
- Option to report absolute values for gene counts (rather than normalized gene counts)
- Include option to release all results (regardless of whether rules for reporting were met)
- Identification of iron-sulfur proteins.