Outputs
Depending on how you configure your CLI execution, you should expect to see these files in the output/final/ folder:
*.A.txt.gz
*.C.txt.gz
*.G.txt.gz
*.T.txt.gz
*.coverage.txt.gz
*.depthTable.txt
**_refAllele.txt
*.rds
*.signac.rds
In order, the *{A,C,G,T}.txt.gz files are formatted as sparse matrices, listing the position, the cell, and then the forward and reverse strand counts of that letter for that cell and position. These files enumerate all of the sequenced alleles for all cells in the mitochondrial DNA and are the minimal units to be utilized from mgatk.
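As a rough illustration of the layout described above, the sketch below parses rows of position, cell, forward count, and reverse count into a per-cell dictionary. The comma delimiter, the absence of a header row, and the function names `read_allele_counts` and `read_allele_file` are assumptions made for this example, not something mgatk documents here; check a real file before relying on them.

```python
import csv
import gzip
import io
from collections import defaultdict


def read_allele_counts(handle, delimiter=","):
    """Parse rows of (position, cell, forward, reverse) from an open text handle.

    Returns {cell: {position: (forward_count, reverse_count)}}.
    The delimiter and the lack of a header row are assumptions.
    """
    counts = defaultdict(dict)
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        # Only the first four columns are used; extra columns are ignored.
        position, cell, forward, reverse = row[:4]
        counts[cell][int(position)] = (int(forward), int(reverse))
    return dict(counts)


def read_allele_file(path, delimiter=","):
    """Open a gzipped allele count file (e.g. one of the *.A.txt.gz outputs)."""
    with gzip.open(path, "rt", newline="") as handle:
        return read_allele_counts(handle, delimiter)


# Toy in-memory input standing in for a decompressed file (made-up values).
demo = io.StringIO("1,cellA,3,2\n1,cellB,0,4\n7,cellA,1,1\n")
demo_counts = read_allele_counts(demo)
```

Because the format is sparse, positions with no observed counts for a cell are simply absent from the dictionary rather than stored as zeros.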
For convenience, the tool also emits the mean per-cell depth in the *.depthTable.txt file. This is computed as (total bases accounted for) / (length of mtDNA contig). Additionally, the *.coverage.txt.gz file provides a sparse matrix representation of the per-cell, per-position coverage.
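The depth formula can be sketched from per-position coverage records as follows. The record layout of (position, cell, coverage) tuples and the function name `mean_depth_per_cell` are assumptions for illustration, and the contig length of 4 is a toy value; the real length depends on the reference you aligned to.

```python
from collections import defaultdict


def mean_depth_per_cell(coverage_records, contig_length):
    """Mean depth = (total bases accounted for) / (length of mtDNA contig).

    coverage_records: iterable of (position, cell, coverage) tuples (assumed layout).
    Positions missing from the sparse records contribute zero coverage.
    """
    totals = defaultdict(int)
    for _position, cell, coverage in coverage_records:
        totals[cell] += coverage
    return {cell: total / contig_length for cell, total in totals.items()}


# Toy records with made-up values on a hypothetical 4 bp contig.
records = [(1, "cellA", 10), (2, "cellA", 30), (1, "cellB", 20)]
depths = mean_depth_per_cell(records, contig_length=4)
```

Dividing by the full contig length, rather than by the number of covered positions, means that uncovered positions pull the mean down, which matches the formula stated above.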
To orient these abundances in the context of potential mutations, the **_refAllele.txt file shows the reference alleles for the contig used in alignment/processing. This file will be independent of your source data and purely a function of the chosen reference.
Finally, two .rds files are automatically emitted that synthesize these files. The *.signac.rds file contains an S3 object that can be rapidly integrated in the Signac R package (see vignettes here: https://satijalab.org/signac/). The other *.rds file is a RangedSummarizedExperiment that similarly summarizes all data in a slightly different S4 object. Either of these can be rapidly integrated into existing scATAC-seq workflows, depending on your analysis method of choice.
Please raise an issue here