Tailor is a fast short-read aligner like bowtie/BWA/SOAP2, but is designed specifically to discover trimming and tailing events for small silencing RNAs.

Tailor

Tailor is a fast Burrows–Wheeler transform (BWT) based aligner specialized in discovering non-templated addition of nucleotides to the 3' end of nucleic acids (a.k.a. tailing) from next-generation sequencing data.

Tailor is released under GPLv2 with an additional restriction: the license applies only to individuals and non-profits, and any for-profit company must purchase a different license.

Shell-based pipelines are provided; they take fastq files as input and produce publication-quality figures.

INSTALL

*Only 64-bit systems can compile and run Tailor.
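
As a rough pre-flight check for the 64-bit requirement, the sketch below classifies a machine string such as the one printed by `uname -m`. The helper name `is_64bit_arch` and the list of architecture names are assumptions made for illustration; they are not part of Tailor, and the list is not exhaustive.

```shell
# Hypothetical helper (not part of Tailor): succeed if the given machine
# string names a common 64-bit architecture.
is_64bit_arch() {
    case "$1" in
        x86_64|amd64|aarch64|arm64|ppc64|ppc64le|s390x|riscv64) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the current machine.
if is_64bit_arch "$(uname -m)"; then
    echo "64-bit system detected"
else
    echo "this does not look like a 64-bit system"
fi
```

The check only inspects the reported machine name; it does not verify that the precompiled binary matches your platform.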

Run the binary directly without installation

Please try the precompiled binaries first; most Linux systems should be able to run Tailor without any trouble.

bin/tailor_v.1.0_linux_static 		# for linux

It is also available under the release tab on this page or at this link.

Compile from the source code

Install the dependencies

  • 1.1 A relatively recent C++ compiler that supports most features of C++11. We recommend GCC.
  • 1.2 Boost
  • 1.3 CMake
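
Before configuring, it can help to confirm that the command-line build tools are on your PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper, not something shipped with Tailor, and the tool names `g++`, `cmake` and `make` assume a GCC-based setup. Boost is a library rather than a command, so it cannot be detected this way.

```shell
# Hypothetical pre-flight check; not part of the Tailor repository.
# Prints the name of every argument that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool names for a GCC-based build.
missing=$(missing_tools g++ cmake make)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```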

Get the latest version of the software

git clone git@github.com:jhhung/Tailor.git

Enter the folder Tailor and:

  • Set the environment variable $BOOST_ROOT to the Boost directory if CMake cannot find Boost automatically;
  • Set the environment variables $CC and $CXX to the gcc/g++ compilers you want to use.
cmake .

Compile the software by typing:

make

Troubleshooting

  • If you get a linker error, the default library in lib/ may not be suitable for your platform. There is one library available for Linux; rename the one that fits best to "libabwt_table.a" and rerun
make

USAGE


tailor

Build genomic index (similar to bowtie-build)

Options:

-h [ --help ]         display this help message and exit
-i [ --input ] arg    The input fasta file.
-p [ --prefix ] arg   Prefix of index file to generate.
-f [ --force ]        Overwrite the existing index files if they already exist.

Example:

tailor build -i genome.fa -p genome

Mapping

-h [ --help ]                 display this help message and exit
-i [ --input ] arg            Input fastq file
-p [ --index ] arg            Prefix of the index
-o [ --output ] arg (=stdout) Output SAM file, stdout by default 
-n [ --thread ] arg (=1)      Number of threads to use; if the number is larger than the number of cores available, it will be adjusted automatically
-l [ --minLen ] arg (=18)     Minimum length of the exact match (prefix match) allowed
-v [ --mismatch ]             Allow a mismatch in the middle of the query

Examples:

tailor map -p genome -n 8 -i smallRNA.fq -o smallRNA.sam         # no mismatch
tailor map -p genome -n 8 -i smallRNA.fq -o smallRNA.sam   -v    # allow one mismatch
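
To map several fastq files against the same index, the examples above can be wrapped in a loop. This is a sketch only: `sam_name`, `map_all` and the `DRY_RUN` variable are hypothetical names introduced here, and the only tailor options used are the ones documented above (`-p`, `-n`, `-i`, `-o`).

```shell
# Hypothetical batch wrapper; not part of Tailor.

# Turn reads/sample.fq (or sample.fastq) into sample.sam.
sam_name() {
    base=$(basename "$1")
    base=${base%.fastq}
    base=${base%.fq}
    echo "${base}.sam"
}

# Usage: map_all INDEX_PREFIX THREADS FILE.fq [FILE.fq ...]
# Set DRY_RUN=1 to print the commands instead of running them.
map_all() {
    index_prefix=$1
    threads=$2
    shift 2
    for fq in "$@"; do
        ${DRY_RUN:+echo} tailor map -p "$index_prefix" -n "$threads" -i "$fq" -o "$(sam_name "$fq")"
    done
}
```

For example, `DRY_RUN=1; map_all genome 8 a.fq b.fq` prints one `tailor map` command per input file; with `DRY_RUN` unset the commands are executed and each SAM file is written to the current directory.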

tailing pipeline

# reads.fq: input reads in fastq format
# dm3.fa: genome sequence in fasta format; an index is generated in the index folder of the tailor directory if it doesn't already exist
# genomic_feature_file: used to generate figures for different genomic features (exon, intron...). See our example in the annotation folder to make such a file
# -c 24: use 24 CPUs
# -q 20: use PHRED score 20 as the filter; only reads in which every base scores 20 or higher pass the filter and enter the pipeline
run_miRNA_tailing_pipeline.sh \
	-i reads.fq \
	-g dm3.fa \
	-t genomic_feature_file \
	-o output_dir \
	-c 24 \
	-q 20
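
To make the quality rule concrete, here is a small standalone sketch of an "every base at or above the threshold" filter. It illustrates the rule described in the comments above and is not the pipeline's own implementation. It assumes Phred+33 quality encoding and strict 4-line fastq records; `phred_filter` is a hypothetical name.

```shell
# Illustration only (not the pipeline's code): keep a read only if every
# base quality is >= the threshold. Assumes Phred+33 encoding and
# 4-line fastq records. Reads fastq on stdin, writes fastq on stdout.
phred_filter() {
    awk -v minq="$1" '
        # Build a character -> ASCII code lookup table.
        BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
        # Remember the lines of the current record (line 4 is stored at index 0).
        { rec[NR % 4] = $0 }
        # On the quality line, check every base and print the record if all pass.
        NR % 4 == 0 {
            ok = 1
            for (j = 1; j <= length($0); j++)
                if (ord[substr($0, j, 1)] - 33 < minq) { ok = 0; break }
            if (ok) { print rec[1]; print rec[2]; print rec[3]; print rec[0] }
        }
    '
}
```

For example, `phred_filter 20 < reads.fq > reads.q20.fq` would write only the reads in which no base falls below 20; `reads.q20.fq` is a made-up output name.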

Download


Indexes

  • Tailor indexes
# You can find all the pre-built indexes at:
http://www.jhhlab.tw/Tailor/index/

# Human:
http://www.jhhlab.tw/Tailor/index/hg18.tar.gz
http://www.jhhlab.tw/Tailor/index/hg19.tar.gz

# Mouse:
http://www.jhhlab.tw/Tailor/index/mm9.tar.gz
http://www.jhhlab.tw/Tailor/index/mm10.tar.gz

# for downloading
lftp -c "pget -n 4 http://www.jhhlab.tw/Tailor/index/hg18.tar.gz"
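
The download command above can be generalized to any of the listed genomes. The sketch below is a convenience wrapper under stated assumptions: `index_url` and `fetch_index` are hypothetical helpers, the URL pattern is taken from the links listed above, and the layout of the extracted archive is not documented here, so check its contents before pointing `tailor map -p` at it.

```shell
# Hypothetical helpers; not part of Tailor.

# Build the download URL for a genome name such as hg18, hg19, mm9 or mm10.
index_url() {
    echo "http://www.jhhlab.tw/Tailor/index/$1.tar.gz"
}

# Download the archive with 4 parallel connections, then unpack it
# into the current directory.
fetch_index() {
    lftp -c "pget -n 4 $(index_url "$1")" && tar -xzf "$1.tar.gz"
}
```

Calling `fetch_index mm10` would download and unpack the mouse mm10 index.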

Speed test files for the publication

  • You can download all the related files for the speed test from the link
http://www.jhhlab.tw/Tailor/speed_test_samples/
  • And the links to the original data for non-tailed and tailed reads
http://www.jhhlab.tw/Tailor/speed_test_samples/Drosophila_melanogaster.2m.fq
http://www.jhhlab.tw/Tailor/speed_test_samples/Drosophila_melanogaster.all.randomeTailed.fq
  • And the speed test logs (3 runs)
http://www.jhhlab.tw/Tailor/speed_test_samples/test_speed.log
http://www.jhhlab.tw/Tailor/speed_test_samples/test_speed.log2
http://www.jhhlab.tw/Tailor/speed_test_samples/test_speed.log3
  • And the speed logs for bowtie tailing
http://www.jhhlab.tw/Tailor/speed_test_samples/tailing.log
http://www.jhhlab.tw/Tailor/speed_test_samples/tailing.log2
http://www.jhhlab.tw/Tailor/speed_test_samples/tailing.log3
  • All scripts for speed test can be found in the utils directory

Citing Tailor

to cite annotations included in Tailor

Contact

    Jui-Hung Hung <juihunghung `at` gmail.com>
    Bo W Han <bowhan `at` me.com>
    Chiung-Po Hsiao <restart0216s `at` gmail.com>
