This document describes the pipeline used for the bioinformatic analysis of mitochondrial
DNA (mtDNA) sequence data. The pipeline involves processing long-read sequencing data
obtained from Oxford Nanopore Technologies. The core tools used in this pipeline are
minimap2, samtools, and bcftools.
The pipeline processes raw sequencing data (in FASTQ format) to ultimately produce filtered variant call files (VCF). Below are the steps involved:
Aligns the raw sequencing reads to a reference mtDNA sequence.
minimap2 -ax map-ont <MTDNA_Reference.fna> <Sample.fastq> > <Output_file_name.sam>
Converts the SAM file generated by minimap2 to a binary format (BAM).
samtools view -Sb <input.sam> > <output.bam>
Sorts the BAM file by genomic coordinates.
samtools sort <input.bam> -o <sorted_output.bam>
Creates an index for the sorted BAM file.
samtools index <sorted_output.bam>
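The four steps above can also be chained so that no intermediate SAM or unsorted BAM file is written. The following is a minimal sketch, not part of the original pipeline: the function name `align_sort_index` and the `THREADS` variable are hypothetical, and it assumes minimap2 and samtools are on the `PATH`.

```shell
# Hypothetical setting: number of threads passed to minimap2 (-t) and samtools sort (-@).
: "${THREADS:=4}"

# align_sort_index <reference.fna> <sample.fastq> <sorted_output.bam>
# Hypothetical helper combining alignment, BAM conversion, sorting and indexing.
align_sort_index() {
    ref=$1
    fastq=$2
    sorted_bam=$3

    # minimap2 writes SAM to stdout; samtools sort reads it from stdin ("-")
    # and writes a coordinate-sorted BAM directly.
    minimap2 -t "$THREADS" -ax map-ont "$ref" "$fastq" |
        samtools sort -@ "$THREADS" -o "$sorted_bam" - || return 1

    # Index the sorted BAM (creates <sorted_output.bam>.bai next to it).
    samtools index "$sorted_bam"
}
```

A call would look like `align_sort_index MTDNA_Reference.fna Sample.fastq Sample.sorted.bam`. Raising `THREADS` is one way to shorten run time on larger datasets, provided the machine has the cores to spare.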
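Performs variant calling on the aligned and sorted reads, using the same reference sequence as in the alignment step. Here -q20 skips alignments with a mapping quality below 20, --ploidy 1 treats the mitochondrial genome as haploid, and the final filter keeps only variants with QUAL above 20.
bcftools mpileup -q20 -Ou -f <MTDNA_Reference.fa> <sorted_output.bam> |
bcftools call -cv --ploidy 1 -f GQ -Ou |
bcftools filter -i 'QUAL>20' -Ov -o <filtered_results.vcf>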
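Variant calling with base-quality filtering: the same command with -Q20 added, so that bases with a quality score below 20 are also ignored.
bcftools mpileup -Q20 -q20 -Ou -f <MTDNA_Reference.fa> <sorted_output.bam> |
bcftools call -cv --ploidy 1 -f GQ -Ou |
bcftools filter -i 'QUAL>20' -Ov -o <filtered_results.vcf>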
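A quick sanity check on the filtered VCF can confirm that the QUAL filter behaved as expected. The sketch below assumes an uncompressed VCF (as written by -Ov) and uses only grep and awk; the function names `count_variants` and `low_qual_records` are hypothetical helpers, not bcftools features.

```shell
# count_variants <file.vcf>
# Prints the number of variant records (all lines that are not header lines).
count_variants() {
    grep -vc '^#' "$1"
}

# low_qual_records <file.vcf> <min_qual>
# Prints records whose QUAL (column 6 of a VCF) is missing or not above min_qual.
# After filtering with 'QUAL>20', calling this with 20 should print nothing.
low_qual_records() {
    awk -F '\t' -v min="$2" '!/^#/ && ($6 == "." || $6 + 0 <= min + 0)' "$1"
}
```

For example, `count_variants filtered_results.vcf` gives the number of retained variants, and `low_qual_records filtered_results.vcf 20` lists any record that should not have passed the filter.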
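Basecalling (performed before alignment, when starting from raw POD5 signal data). Dorado expects the basecalling model first and the POD5 directory second; --kit-name sets the barcoding kit, and --emit-fastq makes the output FASTQ rather than the default unaligned BAM.
dorado basecaller <basecalling_model> /path/to/pod5_pass --kit-name <kitname> --emit-fastq > /path/to/desired_output_directory/output.fastq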
- Input: FASTQ file from Oxford Nanopore sequencing.
- Intermediate: SAM/BAM files for aligned reads.
- Output: VCF file containing the filtered list of variants.
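To keep the input, intermediate and output files of one sample together, their names can be derived from the FASTQ file name. This is only an illustrative sketch: the helper names `sample_name` and `pipeline_files` and the suffixes chosen here are assumptions, not a naming scheme prescribed by the pipeline.

```shell
# sample_name <path/to/Sample.fastq[.gz]>
# Strips the directory and a trailing .fastq/.fq (optionally followed by .gz).
sample_name() {
    name=$(basename "$1")
    name=${name%.gz}
    name=${name%.fastq}
    name=${name%.fq}
    printf '%s\n' "$name"
}

# pipeline_files <path/to/Sample.fastq>
# Lists, one per line, the files the pipeline would produce for this sample
# (hypothetical suffixes; samtools index writes the .bai next to the BAM).
pipeline_files() {
    s=$(sample_name "$1")
    printf '%s\n' "$s.sam" "$s.bam" "$s.sorted.bam" "$s.sorted.bam.bai" "$s.filtered.vcf"
}
```

Running `pipeline_files /data/run1/Sample01.fastq` lists the SAM, BAM, sorted BAM, index and filtered VCF names for `Sample01`.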
- minimap2 for read alignment.
- samtools for manipulation of SAM/BAM files.
- bcftools for variant calling and filtering.
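Before starting a run, it can help to confirm that all three tools are installed. A minimal sketch using the POSIX `command -v` builtin follows; the function name `missing_tools` is hypothetical.

```shell
# missing_tools <tool>...
# Prints the name of every listed tool that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use (commented out so that loading this file has no side effects):
# missing=$(missing_tools minimap2 samtools bcftools)
# [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```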
- Ensure that the reference mtDNA file (MTDNA_Reference.fna/.fa) is accurate and up to date.
- Quality thresholds and other parameters in bcftools commands can be adjusted based on specific project requirements.
- For large datasets, consider increasing system resources to improve processing times.
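One way to make the quality thresholds adjustable per project is to read them from environment variables. The sketch below wraps the variant-calling command from this document; the variable names `MIN_MQ`, `MIN_BQ`, `MIN_QUAL` and the function name `call_variants` are hypothetical, and the defaults mirror the value of 20 used above.

```shell
# Hypothetical settings; each keeps the default of 20 unless already set.
: "${MIN_MQ:=20}"     # minimum mapping quality (bcftools mpileup -q)
: "${MIN_BQ:=20}"     # minimum base quality    (bcftools mpileup -Q)
: "${MIN_QUAL:=20}"   # variants must have QUAL above this to be kept

# call_variants <reference.fa> <sorted.bam> <filtered.vcf>
call_variants() {
    ref=$1
    bam=$2
    out_vcf=$3

    # Pileup, haploid consensus calling, then QUAL filtering, streamed as
    # uncompressed BCF (-Ou) between stages and written as plain VCF (-Ov).
    bcftools mpileup -Q "$MIN_BQ" -q "$MIN_MQ" -Ou -f "$ref" "$bam" |
        bcftools call -cv --ploidy 1 -f GQ -Ou |
        bcftools filter -i "QUAL>$MIN_QUAL" -Ov -o "$out_vcf"
}
```

For instance, `MIN_QUAL=30 call_variants MTDNA_Reference.fa sorted_output.bam filtered_results.vcf` applies a stricter variant filter without editing the command itself.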
For questions or issues related to this pipeline, please contact Ahmed Khalid.