A computational software for identifying retrotransposons from pairwise alignment net data.

junhong-huang/retroSeeker

retroSeeker

retroSeeker is a computational tool for identifying retrotransposons from pairwise alignment net data (.net).

Input data:

Pairwise alignment data (.net) can be downloaded from UCSC: https://hgdownload.soe.ucsc.edu/downloads.html#human
or generated with our pipeline (make_pairwise_alignment_pipeline.pl).
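
For orientation, a file in the UCSC net format is a plain-text hierarchy: a `net` line names a target chromosome and its size, and the indented `fill` and `gap` lines beneath it give target start, target size, query name, query strand, query start and query size, followed by key/value pairs. The Python sketch below reads such lines. The sample records are invented for illustration, and `parse_net_line` is a hypothetical helper, not part of retroSeeker.

```python
def parse_net_line(line):
    """Parse one line of a UCSC net file into a dict (illustrative sketch)."""
    # Leading spaces encode the nesting level of fill/gap records.
    depth = len(line) - len(line.lstrip(" "))
    fields = line.split()
    if not fields:
        return None
    kind = fields[0]
    if kind == "net":
        return {"kind": "net", "chrom": fields[1], "size": int(fields[2])}
    if kind in ("fill", "gap"):
        extras = fields[7:]  # key/value pairs such as "id 1 score 5000"
        return {
            "kind": kind,
            "depth": depth,
            "tStart": int(fields[1]),
            "tSize": int(fields[2]),
            "qName": fields[3],
            "qStrand": fields[4],
            "qStart": int(fields[5]),
            "qSize": int(fields[6]),
            "attrs": dict(zip(extras[0::2], extras[1::2])),
        }
    return None  # comment or unrecognized line


# Invented sample records, not taken from a real net file.
sample = [
    "net chr1 248956422",
    " fill 10000 500 chr1 + 20000 480 id 1 score 5000 ali 450 type top",
    "  gap 10100 50 chr1 + 20100 30",
]
records = [parse_net_line(line) for line in sample]
```

Each `fill` or `gap` record keeps its nesting depth, which is what distinguishes a gap inside a fill from a top-level fill.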

Usage:

Usage: retroSeeker [options] --fa --fai --net

-f/--fa : genome file, fasta format [Required]
-F/--fai : fai index file of the genome; you can generate it with samtools faidx [Required]
-n/--net : Net file, Net Format [Required]
-b/--bed [file] : BED file, BED format; scan the regions in the BED file instead of the net file [Optional]
-o/--output : main retroSeeker output file
-O/--output2 : output the gap and fill regions in the net file (used to identify the retrogene) [Optional]
-v/--verbose : verbose information
-V/--version : retroSeeker version
-h/--help : help information
-S/--self : treat the net as a self-chain net; the default is a normal (non-self) chain net
-x/--RNA5endUp : length to extend upstream of the RNA 5'-end (in the RNA direction) when searching for the TSD [default=30]
-y/--RNA5endDown : length to extend downstream of the RNA 5'-end (in the RNA direction) when searching for the TSD [default=30]
-z/--RNA3endUp : length to extend upstream of the RNA 3'-end (in the RNA direction) when searching for the TSD [default=30]
-w/--RNA3endDown : length to extend downstream of the RNA 3'-end (in the RNA direction) when searching for the TSD [default=30]
-m/--minMatchLen : minimum TSD length [default>=5]
-M/--minPolyaLen : minimum polya length [default>=5]
-P/--maxPolyaLen : maximum polya length [default<=50]
-g/--minGenebodyLen : minimum genebody length [default>=20]
-l/--minLen : minimum length for retrogene [default>=50]
-L/--maxLen : maximum length for retrogene [default<=100000]
-s/--score : minimum score for retrogene [default>=10]
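
The file passed to --fai is a tab-separated index whose first two columns are the sequence name and its length. As a rough illustration of what a 30-base extension around an end coordinate looks like, the Python sketch below reads chromosome lengths from a .fai file and clips a flanking window to the chromosome bounds. This is an assumption-laden sketch, not retroSeeker's implementation: `read_fai` and `flank_window` are hypothetical helpers, the sample index values are made up, and it does not account for strand.

```python
import io


def read_fai(handle):
    """Return {sequence name: length} from the first two columns of a .fai index."""
    lengths = {}
    for line in handle:
        if line.strip():
            name, length = line.rstrip("\n").split("\t")[:2]
            lengths[name] = int(length)
    return lengths


def flank_window(chrom, pos, lengths, up=30, down=30):
    """Window of `up` bases before and `down` bases after pos, clipped to the chromosome."""
    return max(0, pos - up), min(lengths[chrom], pos + down)


# Illustrative index content; offsets and line widths are made up.
sample_fai = io.StringIO(
    "chr1\t248956422\t6\t60\t61\n"
    "chrM\t16569\t253105708\t60\t61\n"
)
lengths = read_fai(sample_fai)
window = flank_window("chr1", 53770991, lengths)
```

Clipping matters near chromosome ends, where a fixed-size extension would otherwise run past the sequence.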

Installation:

Download retroSeeker-1.0.tar.gz from https://github.com/junhong-huang/retroSeeker/releases, unpack it, and run make:
tar -xzvf retroSeeker-1.0.tar.gz
cd retroSeeker-1.0
make

System requirements:

Operating system: retroSeeker is designed to run on POSIX-compatible platforms, including UNIX, Linux and macOS. We have tested it most extensively on Linux and macOS because these are the platforms we develop on.
Compiler: The source code is compiled and tested with the g++ C++ compiler.

Run retroSeeker:

retroSeeker --fa hg38.fa --fai hg38.fa.fai --bed retroACA.demo.bed --minMatchLen 5 --minPolyaLen 5 --minLen 60 --maxLen 100000 --RNA5endUp 30 --RNA5endDown 30 --RNA3endUp 30 --RNA3endDown 30
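
When retroSeeker is driven from a script, the same command can be assembled programmatically. The Python sketch below builds the argument list for the run shown above using only the long options documented in the usage section; `build_command` is a hypothetical helper, and the sketch assumes the retroSeeker binary is on the PATH if the command is actually executed.

```python
import shlex


def build_command(fa, fai, bed=None, net=None, **options):
    """Assemble a retroSeeker argument list from documented long options."""
    cmd = ["retroSeeker", "--fa", fa, "--fai", fai]
    if bed is not None:
        cmd += ["--bed", bed]
    if net is not None:
        cmd += ["--net", net]
    # Remaining keyword arguments map directly to --name value pairs.
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_command(
    "hg38.fa", "hg38.fa.fai", bed="retroACA.demo.bed",
    minMatchLen=5, minPolyaLen=5, minLen=60, maxLen=100000,
    RNA5endUp=30, RNA5endDown=30, RNA3endUp=30, RNA3endDown=30,
)
command_line = shlex.join(cmd)
print(command_line)
# To execute: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.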

Output:

#chrom chromStart chromEnd name score strand tsdScore tsdLength retroLength retroSeq tsd5Seq tsdPair tsd3Seq polyaStart polyaEnd polyaLength polyaScore polyaSeq
chr1 53770991 53771163 retroSeeker_1 27 - 15 10 172 GCAAGGAGAAGGGCATACCCGTAGACCTTGCCTGACTGTGCTCATGTCCAGGCAGGGGGGACATTGTATTCGAGATTAATTTGAAGTTCCTGCCAGCTTTATCCAGCTTAATCAGTGGCTGGATAAATAGCAGGACTGTAACATTCCCCTGGGGGAAAAAAGGCAAGAAGAA GCAAGGAGAA |||||.|||| GCAAGAAGAA 156 161 6 12 AAAAAA
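
To make the columns easier to read, the Python sketch below parses the example row above into named fields and checks a few relations between them. These relations (for instance, that polyaStart and polyaEnd are 1-based, inclusive positions within retroSeq, and that `|` marks a match and `.` a mismatch in tsdPair) are inferred from this single row and are assumptions, not a documented specification. `parse_row` and `pair_string` are hypothetical helpers.

```python
HEADER = (
    "chrom chromStart chromEnd name score strand tsdScore tsdLength "
    "retroLength retroSeq tsd5Seq tsdPair tsd3Seq polyaStart polyaEnd "
    "polyaLength polyaScore polyaSeq"
).split()

INT_FIELDS = (
    "chromStart", "chromEnd", "score", "tsdScore", "tsdLength",
    "retroLength", "polyaStart", "polyaEnd", "polyaLength", "polyaScore",
)

# The example row from the output above; the sequence is split only for display.
ROW = (
    "chr1 53770991 53771163 retroSeeker_1 27 - 15 10 172 "
    "GCAAGGAGAAGGGCATACCCGTAGACCTTGCCTGACTGTGCTCATGTCCAGGCAGGGGGG"
    "ACATTGTATTCGAGATTAATTTGAAGTTCCTGCCAGCTTTATCCAGCTTAATCAGTGGCT"
    "GGATAAATAGCAGGACTGTAACATTCCCCTGGGGGAAAAAAGGCAAGAAGAA"
    " GCAAGGAGAA |||||.|||| GCAAGAAGAA 156 161 6 12 AAAAAA"
)


def parse_row(line):
    """Map one whitespace-separated output row onto the header names."""
    rec = dict(zip(HEADER, line.split()))
    for key in INT_FIELDS:
        rec[key] = int(rec[key])
    return rec


def pair_string(a, b):
    """Render '|' for matching positions and '.' for mismatches."""
    return "".join("|" if x == y else "." for x, y in zip(a, b))


rec = parse_row(ROW)
checks = {
    "length_from_coords": rec["retroLength"] == rec["chromEnd"] - rec["chromStart"],
    "length_from_seq": rec["retroLength"] == len(rec["retroSeq"]),
    "tsd_length": rec["tsdLength"] == len(rec["tsd5Seq"]) == len(rec["tsd3Seq"]),
    "tsd_pair": pair_string(rec["tsd5Seq"], rec["tsd3Seq"]) == rec["tsdPair"],
    "polya_length": rec["polyaLength"] == rec["polyaEnd"] - rec["polyaStart"] + 1,
    "polya_seq": rec["retroSeq"][rec["polyaStart"] - 1:rec["polyaEnd"]] == rec["polyaSeq"],
}
```

In this row the two TSD copies differ at one position out of ten, which is what the single `.` in tsdPair records.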

How to cite:

Huang J et al., RetroSeeker Reveals Biogenesis, Expression, and Evolution of a Large Set of Novel Retrotransposons, Advanced Biotechnology, 2023

Contact:

Jun-Hong Huang ([email protected]) RNA Information Center, State Key Laboratory for Biocontrol, Sun Yat-sen University, Guangzhou 510275, P. R. China
