Inputting nuclei counts into cell2location #344
Comments
Hi @ssobt Looks like the main issue is indeed that the sample has many areas with very low counts. I would address that issue rather than figuring out how to input nuclei counts.
Please don't use older versions ( |
We don't support nuclei count use because we did not find providing that information useful in benchmarks and it is not available for many datasets. Does the analysis work well and provide the expected results with v.02-alpha? That said, I don't see why providing a 2D shape=(obs, 1) array to N_cells_per_location should be a problem. What error do you see in the latest version? |
Actually, I see a problem with the latest version. This line
|
Hi, thanks for the quick response. I don't see why that line in the last comment would cause problems. It would be a scalar divided by an array divided by another scalar so For some of your questions:
v.02-alpha with a scalar value for 'cells_per_spot' wasn't able to call the low RNA content areas either, similar to the latest version. I tried running the latest version (v.0.1.3) with a dummy
1d array error output
2d array error output
|
Has this problem been solved? I am interested in using N_cells_per_location by inputting nuclei counts. |
We are working on incorporating this information at the moment. It is not as simple as changing the above line but requires substantial changes to the model to effectively use segmentation-derived N_cells_per_location. While this will become possible in a month or so - you need to keep in mind that segmentation is not possible for all datasets and it is mostly reliable for FFPE protocols. |
Also, when you provide segmentation information Segmentation information and large |
Hi @vitkl First of all, thank you for your work! I'm also interested, so if you have any news about your previous comments (the possibility to input the number of cells per spot instead of a sample-wise value), let us know 😄 Regards, |
Hi @vitkl ! Do you have any news ? 😄 |
Hi @benoitsam You can try using this experimental branch #337 (comment). I am planning to finalise this branch and its dependencies (scvi-tools) by December-February. |
Hi @vitkl Thanks for your reply ! I'll try to follow your instructions to install and use this new feature asap and I'll get back here with the results 😄 |
Hi @vitkl Quick comments on #337:
Installation
There was an error with the scvi-tools that you forked.
And then for cell2location with your ongoing branch: It works if I do
Usage
Could you elaborate on your comment about N_cells_per_location?
I'm not sure I understand what I'm supposed to use as input, because I only have a count of cells per spot 🤔 Regards, |
Thanks for suggesting the fix. Good to know. The idea is that cell abundance is proportional to the number of cells and % of the spot occupied by cells - so combining the two measures gives a better result. You can use a count of cells by spot too. You need to delete spots with 0 cells. |
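For anyone following along, here is a minimal sketch of the "delete spots with 0 cells" step, assuming the per-spot counts are already available as a NumPy array. The values are made up, and the `(n_obs, 1)` shape follows the shape mentioned earlier in this thread; whether the experimental branch expects exactly that shape is an assumption.

```python
import numpy as np

# Hypothetical per-spot nuclei counts from segmentation (made-up values).
n_cells = np.array([3, 0, 7, 1, 0, 12])

# Boolean mask: keep only spots that contain at least one cell.
keep = n_cells > 0
n_cells_kept = n_cells[keep]

# One value per remaining spot, as a single column: shape (n_obs, 1).
n_cells_per_location = n_cells_kept.reshape(-1, 1).astype("float32")

print(int(keep.sum()), n_cells_per_location.shape)
```

The same mask can be used to subset the spatial AnnData object (for example `adata_vis[keep].copy()`), so that the spots and the counts stay aligned.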
Hi @vitkl Just to let you know, I managed to use this version on my laptop on a toy dataset. Run info:
For cell2location parameters :
The "out of memory" issue appears every time after the training completed (even with 30000 iterations, I've got the The job ends correctly if I use I wondered if you suspect that your modifications may have impacted the resources required to run cell2location. Regards, |
Hi @benoitsam It looks like the issue is with posterior sampling rather than training, and you run out of RAM, not GPU memory, right? The resource change may be due to the new version rather than to using these settings. Do you mean that you are using old parameters with new code? In general, I would recommend computing quantiles directly like this:
import numpy as np

# In this section, we export the estimated cell abundance (summary of the posterior distribution).
adata_vis = mod.export_posterior(
    adata_vis, sample_kwargs={
        # this has to be done in batches due to a bug in the new version of the code
        'batch_size': int(np.ceil(adata_vis.n_obs / 8)),
        'accelerator': 'gpu',
        'return_observed': False,
    },
    add_to_obsm=['q05', 'q95', 'q50'],
    use_quantiles=True,
) |
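As a rough worked example of what that `batch_size` expression gives, here is the arithmetic for a dataset of 1700 spots (the toy dataset size mentioned in this thread). This only illustrates the ceiling division; the divisor 8 is taken from the snippet above and the variable names are illustrative.

```python
import math

n_obs = 1700   # number of spots (toy dataset size from this thread)
n_splits = 8   # divisor used in the snippet above

# Same arithmetic as int(np.ceil(adata_vis.n_obs / 8)).
batch_size = int(math.ceil(n_obs / n_splits))

# How many minibatches that implies, and how large the last one is.
n_batches = math.ceil(n_obs / batch_size)
last_batch = n_obs - (n_batches - 1) * batch_size

print(batch_size, n_batches, last_batch)  # 213 8 209
```

If RAM is still a problem, a larger divisor gives smaller batches at the cost of more passes.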
Yes it seems to be an Out of Memory from RAM. I had no warning or error log about GPU memory or CUDA issues.
I meant that I used the same "configuration" when I used cell2location (v0.1.4) with
I tried your suggestion. It works locally with the toy dataset (1700 spots, 10 epochs for the training).
Traceback (most recent call last):
File "/sps/lbmc/bsamson/vap/subworkflows/deconvolution/cell2location/fit_model_prior_by_spot.py", line 210, in <module>
main()
File "/sps/lbmc/bsamson/vap/subworkflows/deconvolution/cell2location/fit_model_prior_by_spot.py", line 185, in main
adata_vis = mod.export_posterior(
File "/pbs/throng/lbmc/bsamson/software/miniconda3/envs/cell2loc_prior_by_spot_env/lib/python3.10/site-packages/cell2location/models/_cell2location_model.py", line 520, in export_posterior
self.samples[f"post_sample_{i}"] = self.posterior_quantile(q=q, **sample_kwargs)
File "/pbs/throng/lbmc/bsamson/software/miniconda3/envs/cell2loc_prior_by_spot_env/lib/python3.10/site-packages/cell2location/models/base/_pyro_mixin.py", line 570, in posterior_quantile
return self._posterior_quantile_minibatch(exclude_vars=exclude_vars, batch_size=batch_size, **kwargs)
File "/pbs/throng/lbmc/bsamson/software/miniconda3/envs/cell2loc_prior_by_spot_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/pbs/throng/lbmc/bsamson/software/miniconda3/envs/cell2loc_prior_by_spot_env/lib/python3.10/site-packages/cell2location/models/base/_pyro_mixin.py", line 444, in _posterior_quantile_minibatch
valid_sites = self._get_valid_sites(args, kwargs, return_observed=return_observed)
AttributeError: 'Cell2location' object has no attribute '_get_valid_sites' EDIT: |
At last, it worked on my cluster for real samples 👍 |
Hi @vitkl ! I have some questions for you 😄
Regards, |
Hi @vitkl ! Do you have any news ? :) |
Hi @benoitsam
The limit on the number of spots is GPU memory. See this issue for suggestions about large data #356 - in particular, splitting the data into training batches stratified by Visium batch aka capture area. |
Hi @vitkl ! (and happy new year!) Thank you for your previous answer, I'll consider the merge of samples. I have some questions for you about my output of cell2location for the For some context :
Well, sorry for the long text and thank you in advance if you have read all that ! Benoit |
Hi, thank you for this tool! I have a question about entering cell counts. I’m using an older version of cell2location (v.02-alpha) to input nuclei counts for the 'expected number of cells per location' hyperparameter. We’re having trouble getting the latest version (v.0.1.3) to assign cell probabilities to most of the tissue due to high RNA variability, after trying both 20 and 200 for alpha (see image below for alpha 200). Areas with low RNA content have very low probabilities assigned for any of the reference cell types. To try to alleviate the problem, we switched to the older version to input custom cell/nuclei counts. In v.02-alpha, when I input a 1-dimensional numpy array with the nuclei counts of each spot on the Visium slide (made by concatenating the rows of a 2d x,y array), the following error occurs, asking for one value instead of location-specific values:
Gamma has no finite default value to use, checked: ('median', 'mean', 'mode'). Pass testval argument or adjust so value is finite.
I tried entering the 2d array directly and got the same error. The model only started to run when I entered a single integer, so I was wondering how to input nuclei counts for each spot/location? Any advice on this would be great, thanks!
Here is the model setup: