This repository provides the relevant bioinformatics methods and code for the manuscript: Characterization of emerging swine viral diseases through Oxford Nanopore sequencing using SVA as a model
Two sequencing methods, direct RNA sequencing (DRS) and PCR-cDNA sequencing (PCS), were evaluated. The objective of this study is to provide a hands-on reference for investigating viral infectious diseases, especially in emerging situations.
Basecalling of raw reads was performed using Guppy (https://nanoporetech.com/) to generate FASTQ files.
Total yield, total reads, read quality, and read length from whole genome sequencing were analyzed using NanoPlot (https://github.com/wdecoster/NanoPlot).
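For a quick sanity check alongside NanoPlot, the same headline numbers (total reads, total yield, mean and maximum read length) can be computed directly from the FASTQ file. The following is a minimal sketch and not part of the original workflow: `fastq_stats` is a hypothetical helper name, it assumes four-line FASTQ records, and the output directory name in the commented NanoPlot call is an assumption.

```shell
# Assumed NanoPlot invocation (output directory name is hypothetical):
# NanoPlot --fastq sva.fastq -o nanoplot_results

# Hypothetical helper: summarize a four-line-per-record FASTQ file.
# Usage: fastq_stats sva.fastq
fastq_stats() {
  awk 'NR % 4 == 2 {                      # every 2nd line of a record is the sequence
         n++; len = length($0); total += len
         if (len > max) max = len
       }
       END {
         if (n == 0) { print "reads=0"; exit }
         printf "reads=%d\tyield_bases=%d\tmean_length=%.1f\tmax_length=%d\n", n, total, total / n, max
       }' "$@"
}
```

Read quality is not covered by this sketch; NanoPlot remains the tool of record for the quality summaries.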
To evaluate raw error rates, raw reads (sva.fastq) were mapped to the SVA reference sequence (reference.fasta) using minimap2-2.12 (https://github.com/lh3/minimap2) and processed with SAMtools (https://github.com/samtools/samtools) to generate sorted, indexed BAM files, which were then evaluated with Qualimap v2.2.1 and AlignQC (https://github.com/jason-weirather/AlignQC).
minimap2 -ax map-ont reference.fasta sva.fastq > minimap_sva.sam
samtools view -b minimap_sva.sam > minimap_sva.bam
samtools sort minimap_sva.bam -o minimap_sva_sorted.bam
samtools index minimap_sva_sorted.bam
qualimap_v2.2.1/qualimap bamqc -bam minimap_sva_sorted.bam -outdir qualimap_results
alignqc analyze minimap_sva_sorted.bam -r reference.fasta --no_annotation -o sva_alignqc.xhtml
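As a rough cross-check of the AlignQC report, a per-read error rate can be estimated from the SAM file alone. The sketch below is an illustration under stated assumptions rather than the method used in the manuscript: `sam_error_rate` is a hypothetical helper, and it defines the error rate as the NM edit distance divided by the number of alignment columns (M, I, D, = and X operations in the CIGAR string), counting only primary mapped alignments.

```shell
# Hypothetical helper (not part of minimap2, SAMtools, or AlignQC).
# Prints "read_name<TAB>error_rate" for each primary mapped alignment.
# Usage: sam_error_rate minimap_sva.sam
sam_error_rate() {
  awk -F '\t' '
    /^@/ { next }                                   # skip SAM header lines
    {
      flag = $2 + 0
      # skip unmapped (0x4), secondary (0x100), and supplementary (0x800) records
      if (int(flag / 4) % 2 || int(flag / 256) % 2 || int(flag / 2048) % 2) next
      nm = -1
      for (i = 12; i <= NF; i++) if ($i ~ /^NM:i:/) nm = substr($i, 6) + 0
      if (nm < 0) next                              # no NM tag: cannot estimate
      cigar = $6; cols = 0
      # walk the CIGAR string one operation at a time
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(cigar, 1, RLENGTH - 1) + 0
        op = substr(cigar, RLENGTH, 1)
        if (op ~ /[MID=X]/) cols += len             # alignment columns only
        cigar = substr(cigar, RLENGTH + 1)
      }
      if (cols > 0) printf "%s\t%.4f\n", $1, nm / cols
    }
  ' "$@"
}
```

Soft- and hard-clipped bases are excluded from the denominator, so this value describes the aligned portion of each read only.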
What's in My Pot (WIMP) was used to classify sequencing reads. WIMP is available through EPI2MEAgent (https://nanoporetech.com/).
The input is the raw FASTQ files produced by basecalling; the output is a species-level taxonomic classification.
A custom SVA sequence database was generated by downloading all SVA whole-genome sequences available in GenBank.
The Basic Local Alignment Search Tool (BLAST) was used to determine the viral strain by recording the top hit based on bit score.
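Selecting the top hit by bit score can be sketched as follows. This is an illustration under stated assumptions, not the exact commands used for the manuscript: the file names `sva_genbank.fasta`, `sva_db`, and `blast_hits.tsv`, the choice of `racon.fasta` as the query, and the helper `top_hit_by_bitscore` are all hypothetical, and the BLAST output is assumed to be in the default tabular format (`-outfmt 6`), where column 12 is the bit score.

```shell
# Assumed BLAST+ commands (all file and database names are hypothetical):
# makeblastdb -in sva_genbank.fasta -dbtype nucl -out sva_db
# blastn -query racon.fasta -db sva_db -outfmt 6 -out blast_hits.tsv

# Hypothetical helper: keep the highest-bit-score row for each query.
# Usage: top_hit_by_bitscore blast_hits.tsv
top_hit_by_bitscore() {
  awk -F '\t' '
    # column 1 = query ID, column 12 = bit score in default -outfmt 6
    (!($1 in best)) || (($12 + 0) > best[$1]) { best[$1] = $12 + 0; row[$1] = $0 }
    END { for (q in row) print row[q] }             # output order is unspecified
  ' "$@"
}
```

Column 2 of each printed row is the subject sequence ID, which identifies the closest strain in the custom database. Ties are resolved in favor of the first row encountered.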
DRS (direct RNA sequencing): Software and version: Racon-1.3.2
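# Extract the longest read to use as the draft sequence
cat sva.fastq | paste - - - - | awk -F '\t' '{L=length($2);if(L>M) {M=L;R=$0;}} END {print R;}' | tr "\t" "\n" > largest.fastq
# Convert the longest read from FASTQ to FASTA
seqtk-master/seqtk seq -a largest.fastq > largest.fasta
# Map all reads to the longest read
minimap2 -ax map-ont largest.fasta sva.fastq > sva_mapped.sam
# Polish the longest read with Racon to produce the consensus
racon sva.fastq sva_mapped.sam largest.fasta > racon.fasta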
PCS (PCR-cDNA sequencing): Software and version: Canu-1.6 (https://github.com/marbl/canu)
canu -p ssuisX -d ssuis__assembly genomeSize=2m -nanopore-raw ssuisX.fastq useGrid=0 # assembles the MinION raw reads in ssuisX.fastq into ssuisX.contigs.fasta
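Before comparing either result with the reference, it can help to confirm that the Racon consensus or the Canu contigs have a plausible length. A minimal sketch follows; `fasta_lengths` is a hypothetical helper that is not part of the original workflow, and it simply prints each record name and its sequence length.

```shell
# Hypothetical helper: print "record_name<TAB>length" for each FASTA record.
# Usage: fasta_lengths racon.fasta
fasta_lengths() {
  awk '/^>/ {
         if (name != "") print name "\t" len        # flush the previous record
         name = substr($1, 2); len = 0              # name = header up to first space
         next
       }
       { len += length($0) }                        # sequences may span several lines
       END { if (name != "") print name "\t" len }' "$@"
}
```

The same function works on multi-record files, so it can be pointed at the contigs file written by Canu as well.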