🐍 hydra-genetics/alignment

Snakemake module containing processing steps that should be performed during sequence alignment.

💬 Introduction

The module consists of alignment processing steps, such as aligning `.fastq` files and marking duplicates in `.bam` files.

❗ Dependencies

In order to use this module, the following dependencies are required:

🎒 Preparations

Sample and unit data

Input data should be added to `samples.tsv` and `units.tsv`. The following information needs to be added to these files:

Column Id	Description
`samples.tsv`
sample	unique sample/patient id, one per row
`units.tsv`
sample	same sample/patient id as in `samples.tsv`
type	data type identifier (one letter), can be one of T (Tumor), N (Normal), R (RNA)
platform	type of sequencing platform, e.g. `NovaSeq`
machine	specific machine id, e.g. NovaSeq instruments have `@Axxxxx`
flowcell	identifier of flowcell used
lane	flowcell lane number
barcode	sequence library barcode/index, connect forward and reverse indices by `+`, e.g. `ATGC+ATGC`
fastq1/2	absolute path to forward and reverse reads
adapter	adapter sequences to be trimmed, separated by comma
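
As a minimal sketch, the column layout above can be checked before the workflow is run. The sample values, the helper names `missing_columns` and `read_units`, and the reading of `fastq1/2` as two separate columns (`fastq1` and `fastq2`) are assumptions made for illustration, not part of the module.

```python
import csv
import io

# Column names taken from the table above; "fastq1/2" is assumed to mean
# two separate columns, fastq1 and fastq2.
UNITS_COLUMNS = [
    "sample", "type", "platform", "machine", "flowcell",
    "lane", "barcode", "fastq1", "fastq2", "adapter",
]

# Hypothetical example content for units.tsv (tab-separated).
EXAMPLE_UNITS = (
    "sample\ttype\tplatform\tmachine\tflowcell\tlane\tbarcode\t"
    "fastq1\tfastq2\tadapter\n"
    "NA12878\tT\tNovaSeq\t@A00001\tHXXXXXXXX\t1\tATGC+ATGC\t"
    "/data/NA12878_R1.fastq.gz\t/data/NA12878_R2.fastq.gz\t"
    "AGATCGGAAGAGC,AGATCGGAAGAGC\n"
)


def missing_columns(handle, required=UNITS_COLUMNS):
    """Return the required column names that are absent from the header."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    return [name for name in required if name not in header]


def read_units(handle):
    """Parse a tab-separated units file into a list of row dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


missing = missing_columns(io.StringIO(EXAMPLE_UNITS))
units = read_units(io.StringIO(EXAMPLE_UNITS))
print(missing)
print(units[0]["sample"], units[0]["type"])
```

For a real run, `io.StringIO(EXAMPLE_UNITS)` would be replaced by an open file handle on `units.tsv`.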

Reference data

You need an indexed reference genome, e.g. `reference.fna`.

For bwa, the index files are generated by `bwa index`. The dict file is generated using picard `CreateSequenceDictionary`. The fai file is generated using `samtools faidx`.

File	Description
reference.dict	dictionary file
reference.fna.amb	record appearance of N (or other non-ATGC) in the ref fasta
reference.fna.ann	record ref sequences, name, length, etc
reference.fna.bwt	the Burrows-Wheeler transformed sequence
reference.fna.fai	index file
reference.fna.pac	packed sequence (four bases encoded in one byte)
reference.fna.sa	suffix array index
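
The file list above can be turned into a quick pre-flight check. The following is a sketch only: the helper names `expected_reference_files` and `missing_reference_files` are hypothetical, and the naming rule (the dict file replaces the `.fna` extension, all other files append a suffix to it) is inferred from the table.

```python
from pathlib import Path

# Suffixes appended to the reference file name by `bwa index`.
BWA_SUFFIXES = [".amb", ".ann", ".bwt", ".pac", ".sa"]


def expected_reference_files(reference):
    """List the index files expected next to a reference FASTA."""
    ref = Path(reference)
    files = [ref.with_suffix(".dict")]      # reference.fna -> reference.dict
    files.append(Path(str(ref) + ".fai"))   # reference.fna -> reference.fna.fai
    files += [Path(str(ref) + suffix) for suffix in BWA_SUFFIXES]
    return sorted(str(f) for f in files)


def missing_reference_files(reference):
    """Return the expected index files that do not exist on disk."""
    return [f for f in expected_reference_files(reference) if not Path(f).exists()]


for name in expected_reference_files("reference.fna"):
    print(name)
```

Calling `missing_reference_files` with the path to your own reference returns an empty list once all indices have been generated.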

✅ Testing

The workflow repository contains a small test dataset in `.tests/integration`, which can be run like so:

$ cd .tests/integration
$ snakemake -s ../../Snakefile -j1 --use-singularity

🚀 Usage

To use this module in your workflow, follow the description in the snakemake docs. Add the module to your Snakefile like so:

module alignment:
    snakefile:
        github(
            "hydra-genetics/alignment",
            path="workflow/Snakefile",
            tag="v0.1.0",
        )
    config:
        config


use rule * from alignment as alignment_*

Compatibility

Latest:

prealignment:v0.2.0

See the `COMPATIBILITY.md` file for a complete list of module compatibility.

Input files

File	Description
`hydra-genetics/prealignment data`
`prealignment/fastp_pe/{sample}_{flowcell}_{lane}_{type}_fastq1.fastq.gz`	trimmed forward reads
`prealignment/fastp_pe/{sample}_{flowcell}_{lane}_{type}_fastq2.fastq.gz`	trimmed reverse reads
`original fastq files`
`PATH/fastq1.fastq.gz`	forward reads retrieved from units.tsv
`PATH/fastq2.fastq.gz`	reverse reads retrieved from units.tsv

Output files

The following output files should be targeted via another rule:

File	Description
`alignment/samtools_merge_bam/{sample}_{type}.bam`	aligned data which have been duplicate marked
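
To illustrate how the output pattern could be targeted from another rule, here is a minimal sketch that builds the list of expected BAM paths from unit records. The helper name `alignment_targets` and the example records are hypothetical, and the sketch assumes that all lanes of the same sample and type end up in a single merged file, as the `{sample}_{type}` pattern suggests.

```python
def alignment_targets(units):
    """Return one merged BAM path per unique (sample, type) pair."""
    pairs = sorted({(unit["sample"], unit["type"]) for unit in units})
    return [
        f"alignment/samtools_merge_bam/{sample}_{unit_type}.bam"
        for sample, unit_type in pairs
    ]


# Hypothetical unit records, as they might be read from units.tsv.
example_units = [
    {"sample": "NA12878", "type": "T", "lane": "1"},
    {"sample": "NA12878", "type": "T", "lane": "2"},
    {"sample": "NA12878", "type": "N", "lane": "1"},
]

targets = alignment_targets(example_units)
for path in targets:
    print(path)
```

In a Snakefile, a list like `targets` could serve as the input of a top-level rule so that Snakemake requests the merged BAM files.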
