🐍 hydra-genetics/prealignment

Snakemake module containing processing steps that should be performed before sequence alignment

💬 Introduction

The module consists of alignment pre-processing steps, such as trimming and merging of .fastq files. We strongly recommend trimming .fastq files prior to alignment. To enable trimming, set the `trimmer_software` stanza in `config.yaml` to the name of the trimming rule, e.g. `fastp_pe`; set it to `None` if trimming should be omitted. Input data should be specified via `samples.tsv` and `units.tsv`.
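As an illustration of the `trimmer_software` setting described above, the following minimal sketch shows how a downstream workflow might check whether trimming is switched on. The helper name `trimming_enabled` is hypothetical and not part of this module; the sketch also assumes the config has already been loaded into a Python dict, where a value written as `None` in YAML typically arrives as the string `"None"` rather than Python's `None`.

```python
def trimming_enabled(config):
    """Return True if the config names a trimming rule (hypothetical helper).

    Assumption: a missing key, an empty value, Python None, and the
    string "None" all mean that trimming should be omitted.
    """
    value = config.get("trimmer_software")
    if value is None:
        return False
    return str(value).strip() not in ("", "None")


# Example configs as they might look after loading config.yaml
with_trimming = {"trimmer_software": "fastp_pe"}
without_trimming = {"trimmer_software": "None"}
```

Treating both the string and the Python value as "disabled" keeps the check robust to how the YAML file was written.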

❗ Dependencies

In order to use this module, the following dependencies are required:

🎒 Preparations

Sample and unit data

Input data should be added to `samples.tsv` and `units.tsv`. The following information needs to be added to these files:

Column Id	Description
`samples.tsv`
sample	unique sample/patient id, one per row
tumor_content	ratio of tumor cells to total cells
`units.tsv`
sample	same sample/patient id as in `samples.tsv`
type	data type identifier (one letter), can be one of `T` (Tumor), `N` (Normal), `R` (RNA)
platform	type of sequencing platform, e.g. `NovaSeq`
machine	specific machine id, e.g. NovaSeq instruments have `@Axxxxx`
flowcell	identifier of flowcell used
lane	flowcell lane number
barcode	sequence library barcode/index, connect forward and reverse indices by `+`, e.g. `ATGC+ATGC`
fastq1/2	absolute paths to forward (`fastq1`) and reverse (`fastq2`) reads
adapter	adapter sequences to be trimmed, separated by comma
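To make the column layout above concrete, here is a minimal sketch that writes one row for each file as tab-separated text using only the Python standard library. All values (sample id, machine, flowcell, lane format, barcode, paths, adapters) are made-up placeholders, and the helper `to_tsv` is hypothetical; only the column names come from the table above.

```python
import csv
import io

# Column names taken from the table above
SAMPLES_COLUMNS = ["sample", "tumor_content"]
UNITS_COLUMNS = [
    "sample", "type", "platform", "machine", "flowcell",
    "lane", "barcode", "fastq1", "fastq2", "adapter",
]


def to_tsv(columns, rows):
    """Serialize dict rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values for illustration only
samples_rows = [{"sample": "NA12878", "tumor_content": "0.6"}]
units_rows = [{
    "sample": "NA12878",          # must match the id in samples.tsv
    "type": "T",                  # one letter: T, N or R
    "platform": "NovaSeq",
    "machine": "@A00001",         # placeholder machine id
    "flowcell": "HXXXXXXXX",      # placeholder flowcell id
    "lane": "L001",               # lane format is an assumption
    "barcode": "ATGC+ATGC",       # forward and reverse index joined by +
    "fastq1": "/data/NA12878_R1.fastq.gz",
    "fastq2": "/data/NA12878_R2.fastq.gz",
    "adapter": "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA,AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
}]

samples_tsv = to_tsv(SAMPLES_COLUMNS, samples_rows)
units_tsv = to_tsv(UNITS_COLUMNS, units_rows)
```

Writing `samples_tsv` and `units_tsv` to `samples.tsv` and `units.tsv` would give files with the expected header lines; a sample sequenced on several lanes would simply get one row per lane in `units.tsv`.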

✅ Testing

The workflow repository contains a small test dataset in `.tests/integration`, which can be run like so:

$ cd .tests/integration
$ snakemake -s ../../Snakefile -j1 --use-singularity

🚀 Usage

To use this module in your workflow, follow the description in the Snakemake docs. Add the module to your Snakefile like so:

module prealignment:
    snakefile:
        github(
            "hydra-genetics/prealignment",
            path="workflow/Snakefile",
            tag="1.0.0",
        )
    config:
        config


use rule * from prealignment as prealignment_*

Output files

The following output files should be targeted via another rule:

File	Description
`prealignment/merged/{sample}_{type}_fastq1.fastq.gz`	Merged and possibly trimmed forward reads
`prealignment/merged/{sample}_{type}_fastq2.fastq.gz`	Merged and possibly trimmed reverse reads
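The sketch below shows one way the output paths above could be expanded for every sample and type, for example to feed the input of a target rule. The function name `merged_fastq_targets` and the in-memory `units` list are hypothetical; in a real workflow the sample and type values would come from `units.tsv`.

```python
# Path pattern from the table above; "read" is fastq1 or fastq2
MERGED_PATTERN = "prealignment/merged/{sample}_{type}_{read}.fastq.gz"


def merged_fastq_targets(units):
    """Build merged fastq paths for each unique (sample, type) pair.

    Several units (e.g. lanes) of the same sample and type are merged
    into one file pair, so duplicates are collapsed first.
    """
    pairs = sorted({(unit["sample"], unit["type"]) for unit in units})
    return [
        MERGED_PATTERN.format(sample=sample, type=unit_type, read=read)
        for sample, unit_type in pairs
        for read in ("fastq1", "fastq2")
    ]


# Placeholder units: two tumor lanes and one normal lane of one sample
units = [
    {"sample": "NA12878", "type": "T"},
    {"sample": "NA12878", "type": "T"},
    {"sample": "NA12878", "type": "N"},
]
targets = merged_fastq_targets(units)
```

Since a Snakefile accepts plain Python, a list like `targets` could be passed as the `input` of a `rule all` in the workflow that imports this module.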

