Reproducing CoveragePlot results during differential peak analysis #1795

Open
si3 opened this issue Oct 4, 2024 · 0 comments
Labels
documentation Documentation help

si3 commented Oct 4, 2024

Hello

Thanks for developing this toolkit! I am trying to identify differentially accessible chromatin regions between cell subpopulations of an integrated Seurat object. The subpopulations are made up of cells from different scATAC-seq experiments and differ in sequencing depth. Per the documentation, CoveragePlot displays aggregated pseudobulk values of Tn5 insertion sites on the y axis, normalized using a "per-group scaling factor computed as the number of cells in the group multiplied by the mean sequencing depth for that group of cells".
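To make the quoted normalization concrete, here is a minimal arithmetic sketch in plain Python. It is not Signac's implementation. The function names, the toy numbers, the constant `scale`, and the reading of "sequencing depth" as total counts per cell are all assumptions made for illustration. Note that the number of cells multiplied by the mean depth is simply the group's total depth.

```python
# Sketch of the per-group scaling quoted from the CoveragePlot documentation.
# All names and numbers are hypothetical; this is not Signac's code.

def group_scaling_factor(depths):
    """Number of cells in the group multiplied by the group's mean depth."""
    n_cells = len(depths)
    mean_depth = sum(depths) / n_cells
    return n_cells * mean_depth  # algebraically, the group's total depth


def normalized_pseudobulk(insertions_per_cell, depths, scale=1e6):
    """Sum insertions over the group's cells, then divide by the scaling factor.

    `scale` is an arbitrary constant (an assumption) that only keeps the
    numbers readable; it does not change comparisons between groups.
    """
    total_insertions = sum(insertions_per_cell)
    return scale * total_insertions / group_scaling_factor(depths)


# Toy data: Tn5 insertions in one region, and total counts per cell.
deep_group = {"insertions": [4, 6, 5, 5], "depths": [20000, 30000, 25000, 25000]}
shallow_group = {"insertions": [1, 0, 1], "depths": [4000, 5000, 6000]}

deep_signal = normalized_pseudobulk(deep_group["insertions"], deep_group["depths"])
shallow_signal = normalized_pseudobulk(shallow_group["insertions"], shallow_group["depths"])
print(deep_signal, shallow_signal)
```

In this toy case the raw pseudobulk totals differ tenfold (20 versus 2 insertions), while the depth-normalized values are much closer, which is the kind of adjustment I would like to reproduce outside of CoveragePlot.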

You can see an example here, where clusters 0 and 8 show clear differences in peak height:

SOX2_Cov_plot.pdf

Although the difference in peak height is clear in this plot, the same region is not called as a differential peak by the standard FindMarkers workflow. In fact, I obtain almost no differentially accessible peaks for the subpopulation of interest (cluster 8) across the entire genome. I've read here on GitHub and elsewhere that sequencing depth strongly influences the number of peaks recovered, but that doesn't seem to be an issue when using the CoveragePlot function.

My question is the following:
How can I pseudobulk and normalize values so that the results are consistent with what CoveragePlot shows, even for cell clusters with low coverage? I've tried AggregateExpression and AverageExpression without success. There doesn't seem to be a vignette that specifically addresses how to do this type of pseudobulk analysis using the same normalization approach as the CoveragePlot function.

@si3 si3 added the documentation Documentation help label Oct 4, 2024