Introduction

PARSEC (imPutAtion for spaRSE sequenCing) is a bioinformatics pipeline designed to genotype large populations using low-coverage sequencing data. It relies on bcftools mpileup to detect SNP sites and on STITCH to impute genotypes.

The pipeline is still in early development.

metro map

  1. Index bams (SAMtools)
  2. Prepare fixed size genomic chunks (bedtools)
  3. Optional: call variants from sparse data
    1. Merge bams on each window (SAMtools)
    2. Call variants for each window (bcftools)
    3. Concatenate vcf files (bcftools)
    4. Sort vcf (bcftools)
    5. Filter variants (bcftools)
  4. Impute genotypes (STITCH)
  5. Index vcf (Tabix)
  6. Concatenate vcf files (bcftools)
  7. Sort vcf (bcftools)
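
To illustrate step 2, here is a minimal Python sketch of cutting a genome into fixed-size chunks. It is not the pipeline's own code, which delegates this step to bedtools. The function name `make_windows`, the example chromosome sizes, and the 1000 bp window size are all hypothetical and chosen only for illustration. Coordinates follow the BED convention (0-based start, exclusive end).

```python
from typing import Dict, Iterator, Tuple


def make_windows(
    chrom_sizes: Dict[str, int], window_size: int
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) windows in BED-style coordinates.

    Hypothetical helper: the real pipeline uses bedtools for this step.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for chrom, length in chrom_sizes.items():
        # Step through the chromosome; the last window is truncated
        # at the chromosome end rather than padded.
        for start in range(0, length, window_size):
            yield chrom, start, min(start + window_size, length)


if __name__ == "__main__":
    # Made-up chromosome sizes, for illustration only.
    sizes = {"chr1": 2500, "chr2": 1000}
    for chrom, start, end in make_windows(sizes, 1000):
        print(f"{chrom}\t{start}\t{end}")
```

Each window can then be processed independently (variant calling, imputation), and the per-window VCF files concatenated and sorted as in the later steps.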

Usage

Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.