yEvo Pipeline

A Snakemake variant-calling pipeline for analyzing yEvo sequencing data.

Installation

Make sure you have conda installed.
Then install Mamba, which the Snakemake docs recommend for installing Snakemake, and set up the pipeline environment:

$ conda install -n base -c conda-forge mamba

$ git clone https://github.com/dunhamlab/yevo_pipeline.git

$ cd yevo_pipeline/ && mamba env create -f environment.yml

$ conda activate yevo_pipeline_env

$ ./scripts/download_test_data.sh

$ ./scripts/gen_run_script.sh

You're ready to run the pipeline!

After following the above installation instructions, run the pipeline on the provided test input files:

$ ./run_pipeline.sh

NOTE: be sure that you are in the repo's base directory with the yevo_pipeline_env conda environment activated.
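
The two conditions in the note above can be checked before launching. The following is a minimal sketch, not part of the repository: the function name check_yevo_ready is hypothetical, and it assumes that run_pipeline.sh sits in the repo's base directory (as the ./run_pipeline.sh command above suggests) and that conda exports the active environment's name in CONDA_DEFAULT_ENV.

```shell
# Hypothetical pre-flight check; this helper is not shipped with the pipeline.
check_yevo_ready() {
    # conda sets CONDA_DEFAULT_ENV to the name of the currently active environment.
    if [ "${CONDA_DEFAULT_ENV:-}" != "yevo_pipeline_env" ]; then
        echo "Activate the environment first: conda activate yevo_pipeline_env" >&2
        return 1
    fi
    # Assumption: run_pipeline.sh lives in the repo's base directory.
    if [ ! -f ./run_pipeline.sh ]; then
        echo "run_pipeline.sh not found; cd to the repo's base directory" >&2
        return 1
    fi
    echo "ok"
}
```

After defining the function in your shell, `check_yevo_ready && ./run_pipeline.sh` would start the pipeline only when both checks pass.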

To run the pipeline on your own sequencing data, edit the following variables in run_pipeline.sh:

FASTQ_DIR is the absolute path to the directory containing your raw data (e.g. fastq.gz files).
OUTPUT_DIR is the absolute path to your desired output directory, which Snakemake will create.
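
Because both variables must be absolute paths, it can help to validate them before a run. The sketch below is illustrative only: the helper names is_absolute and check_run_paths are hypothetical, and the example paths in the usage note are placeholders. It checks that both values start with / and that the FASTQ directory already exists; the output directory is not checked for existence because Snakemake creates it.

```shell
# Hypothetical helpers; not part of the repository.

# Succeeds if the argument starts with "/", i.e. is an absolute path.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Usage: check_run_paths FASTQ_DIR OUTPUT_DIR
check_run_paths() {
    fastq_dir=$1
    output_dir=$2
    is_absolute "$fastq_dir" || { echo "FASTQ_DIR is not absolute: $fastq_dir" >&2; return 1; }
    is_absolute "$output_dir" || { echo "OUTPUT_DIR is not absolute: $output_dir" >&2; return 1; }
    # The raw data directory must already exist; OUTPUT_DIR is created by Snakemake.
    [ -d "$fastq_dir" ] || { echo "FASTQ_DIR does not exist: $fastq_dir" >&2; return 1; }
    echo "ok"
}
```

For example, `check_run_paths /home/user/yevo/fastq /home/user/yevo/output` (placeholder paths) prints ok only if the first directory exists.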

The paths to the reference genome, ancestor, and annotation files are set in config/config.yml and can be modified as needed.