The code to preprocess data #3

SiaGuo · 2024-10-21T01:04:04Z

Hi,

STAMP is quite an efficient algorithm for integrating multiple samples, but I'm not sure how to preprocess the data. For example, I have four samples generated by 10X Visium. How should I preprocess them into the input format that STAMP expects? I would like to know how this part of the code is implemented. Thank you for your help!

Sincerely,
Sia G

Chengwei94 · 2024-10-22T11:50:00Z

@SiaGuo,

You can just use the usual scanpy preprocessing. However, the input to the algorithm is count data, so remember to save the raw counts in layer = "counts". Then, for multiple samples, you can set categorical_covariate_keys=[your_batch]. We will update our docs soon to make this clearer.

SiaGuo · 2024-10-24T01:13:52Z

Got it! Thanks.

katimbach · 2024-10-30T12:03:14Z

Hi! I have a follow-up question regarding this. If I have a similar situation, with multiple samples that I would like to combine, but also want to account for the spatial neighbors, as in the mouse brain example (https://jinmiaochenlab.github.io/scTM/notebooks/stamp/example2/), is this possible?

I'm a bit confused, as it seems in the mouse brain tutorial the data is explicitly provided as a covariate, whereas in other tutorials, such as that with lung cancer, the neighbor graph is created but not provided as a model covariate (https://jinmiaochenlab.github.io/scTM/notebooks/stamp/example3/). If multiple samples are provided (and their distinction is included as a covariate), are the spatial graphs from each spatial sample still considered in the model?

Chengwei94 · 2024-10-30T12:26:01Z

@katimbach

The covariate term is used to correct for batch effects. In the SMI data there is no batch, since there is only one slice. The model is agnostic to how the graph is built, so if you want to build a separate graph for each batch, sq.gr.spatial_neighbors(adata, library_key="data") does that. The library key builds disjoint graphs, one for each batch.

katimbach · 2024-10-30T12:45:34Z

@Chengwei94 Thanks so much for your fast reply! Noted that I can merge samples and build the graphs after using the key.

So, I suppose in the mouse example the "data" is the obs with the slice info, as is the "library_id" in the multi-sample example (https://jinmiaochenlab.github.io/scTM/notebooks/stamp/example6/)? I was just getting confused by the "data" naming aspect (thinking it was another layer or something), but I think I understand now this is just an arbitrary name. I suppose then the neighbors for the multi-slice would've been previously built by sq.gr.spatial_neighbors(adata, library_key="library_id") ?

Chengwei94 · 2024-10-30T13:17:17Z

@katimbach

Yep, you are right on that

SiaGuo · 2024-11-21T02:10:52Z

> @SiaGuo,
>
> You can just use the usual scanpy preprocessing. However, the input to the algorithm is count data, so remember to save the raw counts in layer = "counts". Then, for multiple samples, you can set categorical_covariate_keys=[your_batch]. We will update our docs soon to make this clearer.

Hi, I've come across a dataset that is only available in normalized form. I wonder whether STAMP works on normalized ST data (from a technique with fairly low resolution). Thanks!
