The code to preprocess data #3
You can just use the usual scanpy preprocessing. However, the input to the algorithm is count data, so remember to save the raw counts in layer="counts" before normalizing. For multiple samples, you can set categorical_covariate_keys=[your_batch]. We will update our docs soon to make this clearer.
Got it! Thanks. |
Hi! I have a follow-up question regarding this. If I have a similar situation, with multiple samples that I would like to combine, but also want to account for the spatial neighbors, as in the mouse brain example (https://jinmiaochenlab.github.io/scTM/notebooks/stamp/example2/), is this possible? I'm a bit confused, as it seems in the mouse brain tutorial the data is explicitly provided as a covariate, whereas in other tutorials, such as that with lung cancer, the neighbor graph is created but not provided as a model covariate (https://jinmiaochenlab.github.io/scTM/notebooks/stamp/example3/). If multiple samples are provided (and their distinction is included as a covariate), are the spatial graphs from each spatial sample still considered in the model? |
The covariate term is used to correct for batch effects; the SMI data has no batch because there is only one slice. The model is agnostic to how the graph is built, so if you want to build a separate graph for each batch, sq.gr.spatial_neighbors(adata, library_key="data") does that. The library key builds disjoint graphs, one per batch.
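To illustrate what "disjoint graphs" means here, the sketch below builds a k-nearest-neighbor graph separately within each batch using only NumPy and SciPy. This is not squidpy's implementation; the function name `disjoint_knn_graph`, the random coordinates, and the slice labels are all hypothetical. The point is only that no edge ever connects spots from different slices.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.spatial import cKDTree


def disjoint_knn_graph(coords, batch, k=4):
    """Connect each spot to its k nearest neighbors within the same batch."""
    n = coords.shape[0]
    adj = lil_matrix((n, n), dtype=np.int8)
    for b in np.unique(batch):
        idx = np.flatnonzero(batch == b)  # global indices of this batch
        tree = cKDTree(coords[idx])
        # Query k + 1 neighbors because each point's nearest hit is itself.
        _, nbrs = tree.query(coords[idx], k=k + 1)
        for row, neighbors in zip(idx, nbrs[:, 1:]):
            adj[row, idx[neighbors]] = 1
    return adj.tocsr()


rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(20, 2))  # made-up spatial coordinates
batch = np.array(["slice_a"] * 12 + ["slice_b"] * 8)

graph = disjoint_knn_graph(coords, batch)
rows, cols = graph.nonzero()
# Every edge stays inside one slice, so the adjacency is block-structured.
print(bool((batch[rows] == batch[cols]).all()))
```

With squidpy, passing library_key as in the reply above should give the same kind of per-slice structure without any manual loop.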
@Chengwei94 Thanks so much for your fast reply! Noted that I can merge samples and build the graphs after using the key. So, I suppose in the mouse example the "data" is the obs with the slice info, as is the "library_id" in the multi-sample example (https://jinmiaochenlab.github.io/scTM/notebooks/stamp/example6/)? I was just getting confused by the "data" naming aspect (thinking it was another layer or something), but I think I understand now this is just an arbitrary name. I suppose then the neighbors for the multi-slice would've been previously built by |
Yep, you are right on that |
Hi, I've come across a dataset that is only available in normalized form. I wonder whether STAMP works on normalized ST data (from a technique with fairly low resolution). Thanks!
Hi,
STAMP is quite an efficient algorithm for integrating multiple samples, but I wonder how to preprocess the data. For example, I have four samples generated by 10X Visium. How can I preprocess them into the input format STAMP expects? I would like to know how this part of the code is implemented. Thank you for your help!
Sincerely,
Sia G