Differential expression #22

DzenisKoca · 2022-08-19T14:16:36Z

Hello,

First of all, thank you for this tool. ALRA seems really convincing and runs really fast. I wanted to ask a few questions regarding the use of ALRA. The data I am using contain 6 samples from 6 different mice, 3 of which are KO for a certain gene. I want to compare the 3 KO to the 3 WT.

1. Should ALRA be run on data that was normalized using Seurat's SCTransform() function?
2. If I am using ALRA on a few samples that I want to integrate using Seurat, is it OK to run ALRA on each sample individually, before integration? I am using Seurat's SCTransform integration pipeline.
3. After integration, can I use data imputed by ALRA to perform differential expression analysis?
4. After integration, should I use some tool to remove batch effects between samples (3 KO and 3 WT individually)?

Rohit-Satyam · 2023-07-06T19:10:20Z

Hi @JunZhao1990 @linqiaozhi @rcannood @inoue0426, I have a similar query, so I am not opening a new issue. Could you please answer this? @DzenisKoca, did you figure this out?

DzenisKoca · 2023-07-10T15:05:21Z

Hello @Rohit-Satyam ,

Well I figured it out partially.

1/2. I needed to integrate the data, and since integration via the SCTransform pipeline could not be performed using the alra assay, I didn't use the SCTransform function. Instead, I ran ALRA on each sample before integration; then, after performing PCA, I integrated the data using harmony. I tried this method while reanalyzing a couple of publicly available datasets, and the results were satisfying. The outcome was comparable, if not improved, to what was published previously.

3. I preferred not to run differential expression on imputed data since I found no benchmark of this.
4. I didn't delve into this any further.

I hope this helps.
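
The workflow described in 1/2 could be sketched roughly as follows. This is only an illustration under assumptions, not the exact code that was run: it assumes Seurat v4-style objects, that `RunALRA()` comes from SeuratWrappers, and that `sample.list` is a named list of per-sample Seurat objects. The function name `alra_then_harmony` and the `sample` metadata column are made up for this sketch.

```r
# Sketch only: ALRA per sample, then merge, PCA and Harmony.
# Assumes the Seurat, SeuratWrappers and harmony packages are installed.
alra_then_harmony <- function(sample.list) {
  ids <- names(sample.list)

  # Log-normalize and impute each sample separately (no SCTransform).
  prepared <- lapply(ids, function(id) {
    x <- Seurat::NormalizeData(sample.list[[id]])
    x$sample <- id                      # hypothetical column used by Harmony below
    SeuratWrappers::RunALRA(x)          # adds an "alra" assay to the object
  })

  # Merge the imputed samples and run PCA on the "alra" assay.
  combined <- merge(prepared[[1]], y = prepared[-1], add.cell.ids = ids)
  Seurat::DefaultAssay(combined) <- "alra"
  combined <- Seurat::FindVariableFeatures(combined)
  combined <- Seurat::ScaleData(combined)
  combined <- Seurat::RunPCA(combined)

  # Correct the PCA embedding for sample-of-origin effects.
  harmony::RunHarmony(combined, group.by.vars = "sample")
}
```

It would be called as `combined <- alra_then_harmony(sample.list)`, and downstream steps (neighbors, clustering, UMAP) would then use the "harmony" reduction instead of "pca".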

Rohit-Satyam · 2023-07-12T13:53:56Z

Hi @DzenisKoca. Thanks for your response. Yes, I went through the other ALRA issues, where running SCT on imputed data was discouraged due to the assumptions SCT makes about the data. So I am sticking to log normalization. However, I have a few more questions. When I run ScoreJackStraw to determine the number of PCs for downstream analysis, I get a plot like this with all p-values equal to zero:

Do you know what might be causing this?
The code I used was

## n = normal; t= drug treatment, 1 and 2 are time points T1 and T2
sample.list <- list(n1=n1,t1=t1,n2=n2,t2=t2)

## I intend to use alra imputed matrix for integration
sample.list <- lapply(X = sample.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- RunALRA(x, assay="RNA",slot="data")
  x <- FindVariableFeatures(x, nfeatures = 2000,selection.method = "vst")
})

## Malaria Cell Atlas. Don't want to perform imputation
mca.seurat <- mca.seurat %>% NormalizeData() %>% FindVariableFeatures(nfeatures = 2000,selection.method = "vst")
sample.list[[5]] <- mca.seurat  # `[[` stores the whole Seurat object as a single list element
names(sample.list)[5] <- "mca"

saveRDS(sample.list,"sample.list.rds")
features <- SelectIntegrationFeatures(object.list = sample.list)
plasmodium.anchors <- FindIntegrationAnchors(object.list = sample.list, anchor.features = features)  
plasmodium.combined <- IntegrateData(anchorset = plasmodium.anchors)

DefaultAssay(plasmodium.combined) <- "integrated"

# Run the standard workflow for visualization and clustering
plasmodium.combined <- ScaleData(plasmodium.combined, verbose = TRUE, vars.to.regress = "percent.mt")
plasmodium.combined <- RunPCA(plasmodium.combined, verbose = TRUE)
plasmodium.combined <- JackStraw(object = plasmodium.combined, reduction = "pca", dims = 50, num.replicate = 100, prop.freq = 0.1, verbose = TRUE)
plasmodium.combined <- ScoreJackStraw(object = plasmodium.combined, dims = 1:50, reduction = "pca")
JackStrawPlot(object = plasmodium.combined, dims = 1:50, reduction = "pca")
ElbowPlot(plasmodium.combined, ndims = 50)

@linqiaozhi In your paper, you used the JackStraw plot to decide the number of PCs:

After imputation with each method, the number of PCs to retain for each was chosen by the jackstraw method as implemented in Seurat. PCs with an assigned p-value of 1 × 10^-5 or smaller were retained.

I hope you can shed some light as well.
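
For reference, the retention rule quoted above is just a filter on the per-PC p-values. Below is a minimal base-R sketch with made-up numbers: the `scores` data frame and its values are hypothetical and are not output from this dataset.

```r
# Hypothetical per-PC JackStraw p-values (one row per PC); values are invented.
scores <- data.frame(
  PC    = 1:6,
  Score = c(0, 0, 3e-40, 2e-6, 0.03, 0.4)
)

# Keep PCs whose p-value is 1e-5 or smaller, as in the quoted rule.
threshold <- 1e-5
retained  <- scores$PC[scores$Score <= threshold]
retained
# 1 2 3 4
```

Note that if every p-value is reported as 0, as in the plot above, every PC passes this filter, so the rule by itself gives no cutoff.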

DzenisKoca · 2023-07-12T14:38:37Z

Hello,

As can be seen here, the authors didn't investigate whether ALRA should be run on integrated data or before integration. I have not found the answer to this question yet. As suggested by this thread, I ran the integration pipeline with harmony (on ALRA-imputed data), and the results I obtained were satisfying.

I am not sure what is happening with JackStraw; I have not encountered a similar issue yet.

Rohit-Satyam · 2023-07-13T13:05:40Z

Hi @DzenisKoca

Yes, I couldn't find any study where the recommended way of running ALRA was explored properly. But it makes sense biologically to run ALRA imputation separately on each dataset when you have normal and drug-treated single cells. One can then perform integration in Seurat using something like this:

features <- SelectIntegrationFeatures(object.list = sample.list, assay = c("alra","alra","alra","alra","RNA"))
plasmodium.anchors <- FindIntegrationAnchors(object.list = sample.list, anchor.features = features,assay = c("alra","alra","alra","alra","RNA"))  
plasmodium.combined <- IntegrateData(anchorset = plasmodium.anchors)

Even when the assay argument isn't provided, the function will automatically take data from the default assay (if you ran RunALRA, the default will be "alra"). Also, are you suggesting that the IntegrateData function does not consider the "alra" data slot?

Though the benchmarking paper ranks harmony among the top tools for integration, for our malaria dataset we observed it to over-correct (this was also observed in another study published here). So I am a little hesitant to use it here.

Rohit-Satyam mentioned this issue Jul 13, 2023

Which selection.method to use for FindingVariableFeatures on ALRA imputed data #25

Open

Rohit-Satyam mentioned this issue Jul 15, 2023

Which assay and slot does IntegrateData uses satijalab/seurat#7575

Closed
