Cells unassigned by using snp array #66

Open
yilevine opened this issue Jul 29, 2022 · 4 comments

@yilevine

yilevine commented Jul 29, 2022

Hi,

I was trying to demultiplex 20k cells into 4 donors, but only a few cells were assigned to each donor. Each donor was genotyped using the Infinium Omni2.5Exome-8 v1.5 array.

The VCF file looks like this:
[Screenshot: vcf]

I used cellsnp-lite (v1.2.2) and vireoSNP (v0.5.7) to get the results:

#cellsnp-lite code
cellsnp-lite -s $BAM -b $BARCODE -O $OUT_DIR -R $DONOR_VCF --minMAF 0.02 --minCOUNT 20 --gzip

#vireo code
vireo -c $CELL_DATA -d $DONOR_VCF -o $OUT_DIR -t GT -N $n_donor -M 200 --forcelearnGT

#donor_ids
[Screenshot: vireo-2]

I noticed that only 2462 SNPs were used to demultiplex these cells. Is that enough?

Also, in donor_ids, most of the prob_max values were pretty low. I would like to change this parameter. Could you explain how to set it?
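As a rough illustration of what changing a cutoff on prob_max would do, the sketch below re-derives labels from a donor_ids-style table. It assumes the table has the columns `cell`, `prob_max`, `prob_doublet` and `best_singlet`, and that the default cutoff for both probabilities is 0.9; both assumptions should be checked against the vireo manual. The function `reassign` and the toy rows are hypothetical and not part of vireo.

```python
import csv
import io
from collections import Counter


def reassign(rows, min_prob=0.9, max_doublet=0.9):
    """Re-derive a label per cell from per-cell probabilities.

    Hypothetical helper: the 0.9 defaults are an assumption about
    vireo's behaviour, not something confirmed in this thread.
    """
    labels = {}
    for row in rows:
        if float(row["prob_doublet"]) >= max_doublet:
            labels[row["cell"]] = "doublet"
        elif float(row["prob_max"]) >= min_prob:
            labels[row["cell"]] = row["best_singlet"]
        else:
            labels[row["cell"]] = "unassigned"
    return labels


# Toy stand-in for donor_ids.tsv (made-up values, assumed column names).
toy_tsv = (
    "cell\tdonor_id\tprob_max\tprob_doublet\tn_vars\tbest_singlet\tbest_doublet\n"
    "AAAC-1\tdonor0\t0.99\t0.00\t35\tdonor0\tdonor0,donor1\n"
    "AAAG-1\tunassigned\t0.72\t0.05\t6\tdonor2\tdonor1,donor2\n"
    "AACT-1\tunassigned\t0.41\t0.10\t2\tdonor1\tdonor1,donor3\n"
    "AAGT-1\tdoublet\t0.03\t0.97\t40\tdonor3\tdonor0,donor3\n"
)
rows = list(csv.DictReader(io.StringIO(toy_tsv), delimiter="\t"))

strict = reassign(rows, min_prob=0.9)   # assumed default cutoff
relaxed = reassign(rows, min_prob=0.7)  # looser cutoff, more cells assigned

print(Counter(strict.values()))
print(Counter(relaxed.values()))
```

Lowering the cutoff assigns more cells but does not add information: a cell with a low prob_max and a small n_vars is still supported by very few SNPs, so relaxed labels are more likely to be wrong.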

Moreover, I tried using the common variants file genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf as DONOR_VCF in cellsnp-lite, and in the end I got a 90% assignment rate for each donor. So I was wondering whether the SNP array (2.5M) I used provides enough SNPs to demultiplex these cells.

Or do you have other suggestions on troubleshooting this issue? Thanks very much.

Yile

@huangyh09
Collaborator

Hi Yile,

Thanks for sharing your experience. I agree with your assessment that the major reason is the insufficient number of SNPs probed by the array. Usually, I would first perform genotype imputation for array- or WES-based genotypes, e.g., with the Sanger Imputation Server, which may give around 5 to 10 times more SNPs.

Alternatively, you may keep your approach of using common variants for a reference-free deconvolution. You can then use the non-imputed genotypes to match the demultiplexed donors, e.g., with this tutorial.

Yuanhua
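The donor-matching step suggested above can be sketched as a genotype concordance comparison. This is a minimal illustration, not the method used in the linked tutorial: it assumes the genotypes estimated by the reference-free run and the array genotypes have already been parsed into dictionaries keyed by (chromosome, position), with values giving the alternative allele count (0, 1 or 2). The names `concordance`, `match_donors`, `estimated` and `known` are hypothetical.

```python
def concordance(gt_a, gt_b):
    """Fraction of shared, non-missing SNPs with identical genotypes."""
    shared = [
        key for key in gt_a
        if key in gt_b and gt_a[key] is not None and gt_b[key] is not None
    ]
    if not shared:
        return 0.0
    same = sum(gt_a[key] == gt_b[key] for key in shared)
    return same / len(shared)


def match_donors(estimated, known):
    """Map each estimated donor to the known sample with the highest concordance."""
    matches = {}
    for est_name, est_gt in estimated.items():
        scores = {name: concordance(est_gt, gt) for name, gt in known.items()}
        best = max(scores, key=scores.get)
        matches[est_name] = (best, scores[best])
    return matches


# Toy genotypes (made-up): alternative allele counts per (chrom, pos).
estimated = {
    "donor0": {("chr1", 100): 0, ("chr1", 200): 1, ("chr2", 50): 2, ("chr2", 90): 1},
    "donor1": {("chr1", 100): 2, ("chr1", 200): 1, ("chr2", 50): 0, ("chr2", 90): 0},
}
known = {
    "sampleA": {("chr1", 100): 2, ("chr1", 200): 1, ("chr2", 50): 0, ("chr3", 10): 1},
    "sampleB": {("chr1", 100): 0, ("chr1", 200): 1, ("chr2", 50): 2, ("chr3", 10): 0},
}

matches = match_donors(estimated, known)
for est_name, (sample, score) in matches.items():
    print(est_name, "->", sample, round(score, 2))
```

Only SNPs present in both sets contribute, so the comparison still works when the array covers few of the sites seen in the cells. Each estimated donor is matched independently here, so two donors could map to the same sample; in that case the concordance scores should be inspected by hand.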

@yilevine
Author

yilevine commented Aug 8, 2022

Hi Yuanhua,

Thanks very much. I will try the solutions you provided.

I was also wondering if you could share a protocol or pipeline for processing array data. I am very new to microarray data, so I am not sure whether the pipeline I am using is correct.

Thanks.

Yile

@connersk

connersk commented Apr 10, 2023

Thanks for developing this helpful tool! I had a very similar issue with donor genotypes from a SNP array, but found that the workaround using reference-free deconvolution and your donor matching notebook worked! Before looking at the issues on GitHub, I didn't see a link to this notebook anywhere in your documentation, nor did I see any documentation suggesting that SNP array data wouldn't work for donor genotyping. It might help future users quite a bit if you added some detail on best practices for using SNP array genotypes at https://vireosnp.readthedocs.io/en/latest/

@yuantiaotiao

> I was trying to demultiplex 20k cells to 4 donors. But only a few cells were assigned to each donor. each donor was genotyped using Infinium Omni2.5Exome-8 v1.5.

Hello, I also encountered this problem. How did you solve it in the end? Thank you. My VCF file comes from an ASA SNP array, and only a small number of cells are assigned to each sample.
