Zippy modules
Wise, Aaron edited this page Feb 16, 2018 · 12 revisions
Here we list the current ZIPPY modules and their file output formats. To use a module, add the part of its name that precedes 'Runner' as a stage in your proto.json (for example, 'bwa' for BWARunner). Stage names are case-insensitive.
Help on individual stages is now available directly from the command line. Try 'python zippy.py --help (stage)Runner'.
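The naming rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described on this page, not ZIPPY's own lookup code: the helper names `stage_name` and `same_stage` are hypothetical, and lowercasing is just one way to model "case-insensitive".

```python
def stage_name(module_name: str) -> str:
    """Return the stage identifier for a module name.

    Hypothetical helper: drops a trailing 'Runner' and lowercases the rest,
    which is one way to model the case-insensitive rule described above.
    """
    suffix = "Runner"
    if module_name.endswith(suffix):
        module_name = module_name[: -len(suffix)]
    return module_name.lower()


def same_stage(a: str, b: str) -> bool:
    """Compare two stage spellings without regard to case."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    for module in ("Bcl2FastQRunner", "BWARunner", "MarkDuplicatesRunner"):
        print(module, "->", stage_name(module))
```

Under this sketch, 'BWA', 'bwa' and 'Bwa' would all refer to the same stage.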
- Bcl2FastQRunner
  - Outputs fastq
- BWARunner
  - Outputs bam
- BWAAlignStatsRunner: Computes various Picard metrics
  - Outputs align_stats, insert_stats, insert_plot (PDF image)
- DataRunner: Loads external data from a specified directory into the pipeline.
  - Outputs all data found in the specified directory. It assumes that sample names appear in the names of the relevant files, and it performs no workflow of its own.
  - In the JSON file, you must define either self.params.self.samples, a list of samples to load, or self.params.self.sample_sheet, a CSV sample sheet that describes which samples to load.
  - For file types that need to be mapped to ZIPPY types, use self.params.self.optional.type_map, a map from raw file types to the types that ZIPPY expects. For example, RSEM produces '.genes.results' files, but the proper ZIPPY type for them is 'rsem'. To pass RSEM files on to edgeR, set type_map to {".genes.results":"rsem"}.
- RSEMRunner: RNA quantification
  - Outputs rsem (a genes.results file)
- MACSRunner: Peak calling
  - Currently returns no output to ZIPPY (though it generates various peak-related files)
- MarkDuplicatesRunner: Runs Picard's MarkDuplicates
  - Outputs bam
- SubsampleBAMRunner: Subsamples with samtools view
  - Outputs bam
- BloomSubsampleBAMRunner: Subsamples using a Bloom filter
  - Outputs bam
- MergeBamRunner: Runs samtools merge. Currently many-to-one, but a many-to-fewer version is possible.
  - Inputs: samples: which sample names to merge from incoming stages
  - Outputs (one) bam
- IndelRealignment: Runs Hygea. Experimental!
  - Outputs bam
- Pisces: Runs the Pisces variant caller
  - Outputs vcf
- STAR:
  - Takes the 'args' parameter, which holds command line arguments to be passed to STAR
  - Outputs bam
- Strelka:
  - Runs the Strelka variant caller
  - Inputs: bam
  - Outputs: vcf
- FastQC
  - Inputs: fastq
  - Outputs: produces statistical output, but currently returns no output to ZIPPY
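
To make the DataRunner's type_map more concrete, here is a minimal Python sketch of how such a map could be applied to file names. It is not ZIPPY's implementation: the function `classify_file` is hypothetical, sample matching by substring is an assumption based on the statement that sample names appear in file names, and falling back to the bare file extension for unmapped files is also an assumption.

```python
import os
from typing import Dict, List, Optional, Tuple


def classify_file(
    filename: str, samples: List[str], type_map: Dict[str, str]
) -> Optional[Tuple[str, str]]:
    """Return (sample, zippy_type) for a file, or None if no sample matches.

    Hypothetical helper illustrating the type_map idea; not ZIPPY code.
    """
    # Assumption: a file belongs to the first sample whose name it contains.
    sample = next((s for s in samples if s in filename), None)
    if sample is None:
        return None
    # Raw file types listed in type_map are translated to ZIPPY types.
    for raw_type, zippy_type in type_map.items():
        if filename.endswith(raw_type):
            return sample, zippy_type
    # Assumption: unmapped files use their extension as the type.
    extension = os.path.splitext(filename)[1].lstrip(".")
    return sample, extension


if __name__ == "__main__":
    type_map = {".genes.results": "rsem"}  # the example map from this page
    samples = ["sampleA", "sampleB"]       # made-up sample names
    for name in ("sampleA.genes.results", "sampleB.bam", "unrelated.bam"):
        print(name, "->", classify_file(name, samples, type_map))
```

With the example map from this page, a file ending in '.genes.results' is labelled 'rsem', which is the type a downstream stage such as edgeR would look for.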