Question About Karyotype Patterns in scATAC-seq #36

Open
cfusterot opened this issue Dec 5, 2024 · 4 comments

Comments

@cfusterot

Dear authors,

First off, thanks so much for creating and sharing this tool—it’s been incredibly helpful!

I’ve been analyzing some scATAC-seq data and noticed something interesting: in a few samples, most of the gain events seem to cluster around the edges between chromosomes. Do you have any idea why that might be happening? Could it have something to do with chromosomal breakpoints?

I’m using a windowSize of 1e5 for the calculations—could tweaking this parameter help address the issue?

Thanks for your time, and I’d really appreciate any advice you can share!

Best,
Coral

@KatharinaSchmid
Contributor

Hi Coral,

Thanks, good to hear that you like epiAneufinder so far :)

Could you specify a bit more what you mean by "edges between the chromosomes"? Or maybe share one of the karyogram plots, if that is possible?

Do you mean gains in the telomeric regions? In general, mapping quality drops at the telomeres (repetitive regions), so we try to remove them using the file of blacklisted regions. Just to double-check: did you use the correct blacklist file for your genome version? Even so, the removal of the blacklisted regions might not be perfect, so we would be careful about interpreting these gains as biologically meaningful rather than as technical artifacts, especially if you see relatively small gains at the chromosome ends.

Best,
Katharina
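One way to put a number on the telomere question above is to compare how often gain calls fall near a chromosome end with how often any window does. The Python sketch below is only an illustration and is not part of epiAneufinder: the `Window` record, the `edge_fraction` helper, the toy chromosome `chrA`, and the 200 kb margin are all hypothetical and would need to be replaced with your own per-window calls and real chromosome lengths.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Window:
    """One genomic window with its CNV call (hypothetical record layout)."""
    chrom: str
    start: int   # 0-based start, in bp
    end: int     # exclusive end, in bp
    state: str   # "loss", "base" or "gain"


def near_chrom_end(w: Window, chrom_lengths: Dict[str, int], margin: int) -> bool:
    """True if the window overlaps the first or last `margin` bp of its chromosome."""
    return w.start < margin or w.end > chrom_lengths[w.chrom] - margin


def edge_fraction(windows: Iterable[Window], chrom_lengths: Dict[str, int],
                  margin: int, state: Optional[str] = None) -> float:
    """Fraction of windows (optionally only those with `state`) near a chromosome end."""
    selected = [w for w in windows if state is None or w.state == state]
    if not selected:
        return 0.0
    return sum(near_chrom_end(w, chrom_lengths, margin) for w in selected) / len(selected)


# Toy data: one 1 Mb chromosome split into ten 100 kb windows (made-up values).
chrom_lengths = {"chrA": 1_000_000}
gain_indices = {0, 5, 9}
windows = [
    Window("chrA", i * 100_000, (i + 1) * 100_000,
           "gain" if i in gain_indices else "base")
    for i in range(10)
]

gain_edge = edge_fraction(windows, chrom_lengths, margin=200_000, state="gain")
background = edge_fraction(windows, chrom_lengths, margin=200_000)
print(f"gains near ends: {gain_edge:.2f}, all windows near ends: {background:.2f}")
```

If the fraction for gains is much higher than the background fraction across many cells, that would support the suspicion that the gains are tied to chromosome ends; it does not by itself tell you whether the cause is technical or biological.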

@cfusterot
Author

cfusterot commented Dec 9, 2024

Dear Katharina,

Thank you so much for your quick response! I am attaching an image to show you what I meant by "edges" in this context. Here, I've highlighted a couple of places where these gains happen, but you can see that there are more than a few instances where this happens. To obtain these plots, the hg38-blacklist.v2.bed blacklist was used together with BSgenome.Hsapiens.UCSC.hg38; windowSize was set to 1e5.

What would your recommendation be? Do you think I could tweak the function parameters a bit more, or maybe apply a correction score based on the location of the gain?
[Attached image: Untitled]

Once again, thanks a lot for your time!!

Best,
Coral

@KatharinaSchmid
Contributor

Hi, thanks for sharing the plots. Without knowing the exact biological background of your data, we went back and checked our previously analyzed cancer datasets. We sometimes saw small losses at the telomeres, but never gains like the ones in your case. To reduce the probability that this is noise, you could rerun the analysis with an increased window size (e.g. 5e5 or 1e6). This of course also reduces your chances of finding small real CNVs; it is always a trade-off. You might also want to look into the biology of your dataset, to see whether this is plausible in your specific scenario and whether it is visible in other similar datasets. Good luck with your analyses!
Best, Katharina
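The window-size trade-off mentioned above can be made concrete with a little arithmetic. The Python sketch below is an illustration under stated assumptions, not epiAneufinder code: it assumes fragments are spread uniformly over the genome and that per-window counts are Poisson distributed, and it uses a hypothetical 20,000 fragments per cell and an approximate genome size of 3.1 Gb. Only the hg38 chr1 length is a real value.

```python
import math

CHR1_LENGTH_HG38 = 248_956_422   # length of chr1 in hg38, in bp
GENOME_SIZE_BP = 3_100_000_000   # approximate human genome size (assumption)
FRAGMENTS_PER_CELL = 20_000      # hypothetical per-cell fragment count


def n_windows(chrom_length: int, window_size: int) -> int:
    """Number of fixed-size windows needed to tile a chromosome."""
    return math.ceil(chrom_length / window_size)


def expected_fragments(fragments_per_cell: int, window_size: int,
                       genome_size: int = GENOME_SIZE_BP) -> float:
    """Mean fragments per window if coverage were uniform (simplifying assumption)."""
    return fragments_per_cell * window_size / genome_size


def poisson_cv(fragments_per_cell: int, window_size: int,
               genome_size: int = GENOME_SIZE_BP) -> float:
    """Coefficient of variation of a Poisson count: 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(expected_fragments(fragments_per_cell, window_size, genome_size))


for window_size in (100_000, 500_000, 1_000_000):
    print(
        f"windowSize={window_size:>9,d}  "
        f"chr1 windows={n_windows(CHR1_LENGTH_HG38, window_size):>5d}  "
        f"mean fragments/window={expected_fragments(FRAGMENTS_PER_CELL, window_size):.2f}  "
        f"relative noise (CV)={poisson_cv(FRAGMENTS_PER_CELL, window_size):.2f}"
    )
```

Under these assumptions, a tenfold larger window lowers the relative noise per window by a factor of about the square root of ten, but it also means that no event smaller than one window (1 Mb instead of 100 kb) can be resolved.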

@cfusterot
Author

Thank you so much for your response!

I’ll definitely re-run it with different window sizes to observe how it changes. I’ll follow up if I come across any interesting conclusions. :)

Best regards,
Coral
