GenomeAnnotation

This project aims to create an easy way to annotate genomes.

Personal suggestion: use the transcripts from Hisat2+TransDecoder (cufflinks_gtf_to_alignment_gff3.pl) as the input to PASA.

###########update#################

V2.sh: simpler, faster, and free of frameshifts

RNA-based: Hisat2+TransDecoder

Ab-initio: braker3

Homology: complete gene structures from Miniprot

########### Post PASA ############

###Annotating a soft-masked genome can leave transposable elements (TEs) in the annotation. We used OrthoFinder to filter the annotation results: orthologous genes are retained, and non-orthologous genes with only 1 or 2 exons are removed.

##In Orthogroups.GeneCount.tsv, the target species (gene IDs like NNYC0000010.1) is in column $3 and the reference species are in columns $2 and $4.

orthofinder -f orthof -og -M msa -t 12 -S blast_gz

cd orthof/OrthoFinder/Results_*/Orthogroups/

awk '($2 > 0 || $4 > 0) && $3 > 0' Orthogroups.GeneCount.tsv > nny.allortho.count.tsv
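The column numbers used in the filter above depend on the order in which OrthoFinder lists the species, so it may be worth checking the header first. A minimal sketch, assuming a tab-separated table with a single header line; `list_columns` is a hypothetical helper name, not part of OrthoFinder:

```shell
# Hypothetical helper: print "column_number<TAB>column_name" for the header
# line of a tab-separated table such as Orthogroups.GeneCount.tsv.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n' | awk '{ print NR "\t" $0 }'
}

# Example call (run inside the Orthogroups/ directory):
#   list_columns Orthogroups.GeneCount.tsv
```

If the target species is not in column 3, adjust `$2`, `$3` and `$4` in the awk filter accordingly.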

awk 'FNR==NR {a[$1]=$0;next} $1 in a {print a[$1],$0}' nny.allortho.count.tsv Orthogroups.tsv > nny.merge.tsv

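#The grep pattern below assumes target gene IDs like NNYC0000010.1; adjust the prefix for your own species.

grep -oE "NNYC[0-9]+\.[0-9]+" nny.merge.tsv | cut -f1 -d "." > nny.orthogene.list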

#nny.orthogene.list contains all orthologous genes that should be kept. Non-orthologous genes with 1 or 2 exons can be found with TBtools (GXF Stat) or any other method you prefer.
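One possible command-line alternative to TBtools is sketched below. It is not the method used in this project. It assumes a GFF3 file in which each exon line carries a single `Parent=<transcript ID>` attribute, and transcript IDs of the form NNYC0000010.1 (gene ID before the first dot). A gene is reported only when every one of its transcripts has at most 2 exons. The function names and file names are hypothetical.

```shell
# count_exons GFF3
# Prints "transcript_id<TAB>exon_count" for every transcript that has exons.
count_exons() {
    awk -F'\t' '$3 == "exon" {
        # Extract the transcript ID from the Parent= attribute in column 9
        if (match($9, /Parent=[^;]+/)) {
            id = substr($9, RSTART + 7, RLENGTH - 7)
            n[id]++
        }
    }
    END { for (id in n) print id "\t" n[id] }' "$1"
}

# low_exon_non_ortho GFF3 ORTHO_LIST
# Prints gene IDs that are absent from ORTHO_LIST and whose transcripts
# all have 1 or 2 exons (candidates for removal).
low_exon_non_ortho() {
    count_exons "$1" |
    awk -F'\t' -v list="$2" '
        # Load the genes to keep (one gene ID per line)
        BEGIN { while ((getline id < list) > 0) keep[id] = 1 }
        {
            gene = $1
            sub(/\..*$/, "", gene)            # NNYC0000010.1 -> NNYC0000010
            if ($2 + 0 > max[gene] + 0) max[gene] = $2 + 0
        }
        END { for (g in max) if (max[g] <= 2 && !(g in keep)) print g }
    ' | sort
}

# Example call (annotation.gff3 is a placeholder file name):
#   low_exon_non_ortho annotation.gff3 nny.orthogene.list > nny.remove.list
```

The resulting list can then be used to drop those genes from the annotation with whatever GFF3 tool you prefer.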

######################################################

The script "RNA-seq+homology+ab-initio based annotation.sh" was used to annotate the great bustard genome.


hrluo93/GenomeAnnotation
