-
Hi @khkk378, interesting idea. I'm not sure, though, whether that will give you the reliability estimates you want. 100-200 samples would be too small a training dataset (but maybe you mean something different). I have never tried to estimate more than, say, 10-15 different cell types; beyond that it gets very tricky. 100 different cell types is extremely challenging, and I have the feeling that you would get rather nonsensical results from that :-) Regarding scaling, the networks should be expressive enough to deal with that. Nevertheless, I don't think that would work. I know that missing uncertainty estimates are a major drawback of Scaden currently, and I plan to include something like this soon. If you're interested, the easiest way of including that now would be to run the different Scaden models with dropout enabled at prediction time, say 100 times, and then average the results. The standard deviation of those results would give you some uncertainty estimate. Let me know if you're interested in trying this out; I wanted to test it too at some point. Turning this into a discussion.
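To make the dropout idea concrete, here is a minimal sketch of the aggregation step only. It does not use Scaden's code: `toy_predict`, its random weights, and the made-up 20-gene sample are hypothetical stand-ins for one forward pass of a trained model with dropout left on. In a Keras-based model, dropout can usually be kept active by calling the model with `training=True`, but how to hook that into Scaden's prediction step is not shown here and would need checking.

```python
import math
import random
import statistics


def softmax(logits):
    """Turn raw scores into fractions that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_predict(expression, weights, rng, drop_rate=0.5):
    """Hypothetical stand-in for one forward pass with dropout enabled.

    Each input value is dropped with probability `drop_rate`; the
    survivors are rescaled by 1 / (1 - drop_rate) (inverted dropout).
    """
    keep = 1.0 - drop_rate
    masked = [x / keep if rng.random() < keep else 0.0 for x in expression]
    logits = [sum(w * x for w, x in zip(row, masked)) for row in weights]
    return softmax(logits)


def mc_dropout(predict, n_passes=100):
    """Call `predict` n_passes times; return per-cell-type mean and std."""
    runs = [predict() for _ in range(n_passes)]
    per_type = list(zip(*runs))  # transpose: one tuple per cell type
    means = [statistics.fmean(v) for v in per_type]
    stds = [statistics.stdev(v) for v in per_type]
    return means, stds


rng = random.Random(0)
expression = [rng.random() for _ in range(20)]  # made-up bulk sample
weights = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(3)]  # 3 cell types

means, stds = mc_dropout(lambda: toy_predict(expression, weights, rng))
for i, (m, s) in enumerate(zip(means, stds)):
    print(f"cell type {i}: {m:.3f} +/- {s:.3f}")
```

With several models, as in the suggestion above, one option would be to pool the stochastic passes from all of them before taking the mean and standard deviation. Note that this spread only reflects how sensitive the prediction is to dropout; it says nothing about whether the estimate is biologically correct.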
-
I mean 100-200 donor/tissue combinations, so, say, 5 million cells in total. I'm not really talking about the uncertainty of the estimates from a statistical viewpoint (although that is also needed) but from a biological one. A model could give consistent but wrong results. Say I want to estimate cell type fractions in kidney. There are maybe 20 cell types there. Then I was thinking about also including, say, hepatocytes, pancreatic beta cells and so on in the model: cell types that I know aren't in kidney. I expect zero estimates for all of those, and the deviation from zero could be a metric for how prone the model is to picking up more technical aspects of the data (say, damaged cells).
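As a rough illustration of that metric, the sketch below sums the predicted fractions assigned to cell types that should not be present in the tissue. The function name `off_target_score`, the cell type labels, and the fractions are all made up for the example; they are not output from any real model.

```python
def off_target_score(fractions, expected_types):
    """Sum of predicted fractions for cell types not expected in the tissue.

    `fractions` maps cell type name -> predicted fraction for one sample.
    A score of 0 means no signal was assigned to off-target cell types.
    """
    return sum(f for cell_type, f in fractions.items()
               if cell_type not in expected_types)


# Made-up predictions for a single kidney bulk sample.
predicted = {
    "proximal tubule cell": 0.55,
    "podocyte": 0.20,
    "endothelial cell": 0.15,
    "hepatocyte": 0.07,            # not expected in kidney
    "pancreatic beta cell": 0.03,  # not expected in kidney
}
kidney_types = {"proximal tubule cell", "podocyte", "endothelial cell"}

score = off_target_score(predicted, kidney_types)
print(f"off-target fraction: {score:.2f}")
```

Summing is only one possible choice; the maximum single off-target fraction, or the score broken down per off-target cell type, might be more informative about which kind of technical signal the model is picking up.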
-
I need a way to assess the biological reliability of the estimates. One way I was thinking of was to include a bunch of cell types that I know aren't part of my tissue of interest, and then use the estimates for those as a metric for reliability. That would involve training on 100-200 samples, with maybe 100 cell types in total. Do you think I would need to scale the networks for that?