diff --git a/vignettes/MOFA_to_COSMOS.md b/vignettes/MOFA_to_COSMOS.md index 43a0627..f2eaebc 100644 --- a/vignettes/MOFA_to_COSMOS.md +++ b/vignettes/MOFA_to_COSMOS.md @@ -5,25 +5,19 @@ In this tutorial, we use the new Meta-Footprint Analysis MOON to score a mechanistic network. -``` r -#install cosmosR from github directly to obtain the newest version -if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools") -if (!requireNamespace("cosmosR", quietly = TRUE)) devtools::install_github("saezlab/cosmosR") -``` + #install cosmosR from github directly to obtain the newest version + if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools") + if (!requireNamespace("cosmosR", quietly = TRUE)) devtools::install_github("saezlab/cosmosR") DecoupleR is used to score various regulatory interactions. -``` r -if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") -if (!requireNamespace("decoupleR", quietly = TRUE)) BiocManager::install("saezlab/decoupleR") -``` + if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") + if (!requireNamespace("decoupleR", quietly = TRUE)) BiocManager::install("saezlab/decoupleR") We are using the LIANA package to process the LR interactions. -``` r -if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes") -if (!requireNamespace("liana", quietly = TRUE)) remotes::install_github('saezlab/liana') -``` + if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes") + if (!requireNamespace("liana", quietly = TRUE)) remotes::install_github('saezlab/liana') We are using MOFA2 (Argelaguet et al., 2018) to find decompose variance across different omics and use COSMOS to find mechanistic @@ -31,15 +25,13 @@ interpretations of its factors. However, since we are using the python version of MOFA2, please make sure to have [mofapy2](https://github.com/bioFAM/mofapy2) as well as panda and numpy installed in your (conda) environment (e.g. by using reticulate: -reticulate::use_condaenv(“base”, required = T) %\>% -reticulate::conda_install(c(“mofapy2”,“panda”,“numpy”)). For downstream +reticulate::use\_condaenv(“base”, required = T) %>% +reticulate::conda\_install(c(“mofapy2”,“panda”,“numpy”)). For downstream analysis, the R version of MOFA2 is used. -``` r -# Install MOFA2 from bioconductor (stable version) -if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") -if (!requireNamespace("MOFA2", quietly = TRUE)) BiocManager::install("MOFA2") -``` + # Install MOFA2 from bioconductor (stable version) + if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") + if (!requireNamespace("MOFA2", quietly = TRUE)) BiocManager::install("MOFA2") General information (tutorials) on how to use COSMOS and MOFA2, see [COSMOS](https://saezlab.github.io/cosmosR/articles/tutorial.html) and @@ -48,30 +40,28 @@ General information (tutorials) on how to use COSMOS and MOFA2, see For data manipulation, visualization, etc. further packages are needed which are loaded here along with MOFA2 and COSMOS. -``` r -#imports -library(readr) - -#core methods -library(cosmosR) -library(MOFA2) -library(liana) -library(decoupleR) - -#data manipulations -library(dplyr) -library(reshape2) -library(GSEABase) -library(tidyr) - -#plotting -library(ggplot2) -library(ggfortify) -library(pheatmap) -library(gridExtra) -library(RColorBrewer) -# library(RCy3) -``` + #imports + library(readr) + + #core methods + library(cosmosR) + library(MOFA2) + library(liana) + library(decoupleR) + + #data manipulations + library(dplyr) + library(reshape2) + library(GSEABase) + library(tidyr) + + #plotting + library(ggplot2) + library(ggfortify) + library(pheatmap) + library(gridExtra) + library(RColorBrewer) + # library(RCy3) ### From omics data to MOFA ready input @@ -87,123 +77,109 @@ scripts for pre-processing can be found in the associated folder. Here, we have the following views (with samples as columns and genes as rows): transcriptomics (RNA-seq -[PMID31113817](https://pubmed.ncbi.nlm.nih.gov/31113817/)), +[](https://pubmed.ncbi.nlm.nih.gov/31113817/)), proteomics (SWATH (Mass spectrometry) -[PMID31733513](https://pubmed.ncbi.nlm.nih.gov/31733513/%3E)} and +[](https://pubmed.ncbi.nlm.nih.gov/31733513/%3E)} and metabolomics (LC/MS & GC/MS (Mass spectrometry) [DTP NCI60 data](https://wiki.nci.nih.gov/display/NCIDTPdata/Molecular+Target+Data)). The group information is stored inside the RNA metadata and based on -transcription factor clustering (see scripts/RNA/Analyze_RNA.R). +transcription factor clustering (see scripts/RNA/Analyze\_RNA.R). Further, only samples are kept for which each omic is available (however this is not necessary since MOFA2 is able to impute data). -``` r -# Transcriptomics -## Load data -RNA <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv")) -rownames(RNA) <- RNA[,1] -RNA <- RNA[,-1] - -## Remove genes with excessive amount of NAs (only keep genes with max. amount -#of NAs = 33.3 % across cell lines) -RNA <- RNA[rowSums(is.na(RNA))<(dim(RNA)[2]/3),] - -## Only keep highly variable genes (as suggested by MOFA): here top ≈70% genes -#to keep >75% of TFs (103 out of 133). For stronger selection, we can further -#reduce the number of genes (e.g. top 50%) -RNA_sd <- sort(apply(RNA, 1, function(x) sd(x,na.rm = T)), decreasing = T) -hist(RNA_sd, breaks = 100) -``` - -![](MOFA_to_COSMOS_files/figure-gfm/Preparing%20MOFA%20input-1.png) - -``` r -hist(RNA_sd[1:6000], breaks = 100) -``` - -![](MOFA_to_COSMOS_files/figure-gfm/Preparing%20MOFA%20input-2.png) - -``` r -RNA_sd <- RNA_sd[1:6000] -RNA <- RNA[names(RNA_sd),] - -# Proteomics -## Load data -proteo <- as.data.frame(read_csv("data/proteomic/Prot_log10_SWATH_clean.csv")) -rownames(proteo) <- proteo[,1] -proteo <- proteo[,-1] - -## Only keep highly variable proteins (as suggested by MOFA): here top ≈60%. -#For stronger selection, we can further reduce the number of proteins (e.g. only keep top 50%) -proteo_sd <- sort(apply(proteo, 1, function(x) sd(x,na.rm = T)), decreasing = T) -hist(proteo_sd, breaks = 100) -``` - -![](MOFA_to_COSMOS_files/figure-gfm/Preparing%20MOFA%20input-3.png) - -``` r -hist(proteo_sd[1:round(dim(proteo)[1]*3/5)], breaks = 100) -``` - -![](MOFA_to_COSMOS_files/figure-gfm/Preparing%20MOFA%20input-4.png) - -``` r -proteo_sd <- proteo_sd[1:round(dim(proteo)[1]*3/5)] -proteo <- proteo[names(proteo_sd),] - -# Metabolomics -## Load data -metab <- as.data.frame(read_csv("data/metabolomic/metabolomic_clean_vsn.csv")) -rownames(metab) <- metab[,1] -metab <- metab[,-1] - -## Since we have only a limited number of metabolite measurements, -#all measurements are kept here -metab_sd <- apply(metab, 1, function(x) sd(x,na.rm=T)) -hist(metab_sd, breaks = 100) -``` - -![](MOFA_to_COSMOS_files/figure-gfm/Preparing%20MOFA%20input-5.png) - -``` r -# Create long data frame -## Only keep samples with each view present -overlap_patients <- intersect(intersect(names(RNA),names(proteo)),names(metab)) -RNA <- RNA[,overlap_patients] -proteo <- proteo[,overlap_patients] -metab <- metab[,overlap_patients] - -## Create columns required for MOFA -### RNA -RNA <- melt(as.data.frame(cbind(RNA,row.names(RNA)))) -RNA$view <- "RNA" -RNA <- RNA[,c(2,1,4,3)] -names(RNA) <- c("sample","feature","view","value") - -### Proteomics -proteo <- melt(as.data.frame(cbind(proteo,row.names(proteo)))) -proteo$view <- "proteo" -proteo <- proteo[,c(2,1,4,3)] -names(proteo) <- c("sample","feature","view","value") - -### Metabolomics -metab <- melt(as.data.frame(cbind(metab,row.names(metab)))) -metab$view <- "metab" -metab <- metab[,c(2,1,4,3)] -names(metab) <- c("sample","feature","view","value") - -## Merge long data frame to one -mofa_ready_data <- as.data.frame(do.call(rbind,list(RNA,proteo,metab))) - -# Save MOFA metadata information and MOFA input -write_csv(mofa_ready_data, file = "data/mofa/mofa_data.csv") -``` + # Transcriptomics + ## Load data + RNA <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv")) + rownames(RNA) <- RNA[,1] + RNA <- RNA[,-1] + + ## Remove genes with excessive amount of NAs (only keep genes with max. amount + #of NAs = 33.3 % across cell lines) + RNA <- RNA[rowSums(is.na(RNA))<(dim(RNA)[2]/3),] + + ## Only keep highly variable genes (as suggested by MOFA): here top ≈70% genes + #to keep >75% of TFs (103 out of 133). For stronger selection, we can further + #reduce the number of genes (e.g. top 50%) + RNA_sd <- sort(apply(RNA, 1, function(x) sd(x,na.rm = T)), decreasing = T) + hist(RNA_sd, breaks = 100) + +![](MOFA_to_COSMOS_files/figure-markdown_strict/Preparing%20MOFA%20input-1.png) + + hist(RNA_sd[1:6000], breaks = 100) + +![](MOFA_to_COSMOS_files/figure-markdown_strict/Preparing%20MOFA%20input-2.png) + + RNA_sd <- RNA_sd[1:6000] + RNA <- RNA[names(RNA_sd),] + + # Proteomics + ## Load data + proteo <- as.data.frame(read_csv("data/proteomic/Prot_log10_SWATH_clean.csv")) + rownames(proteo) <- proteo[,1] + proteo <- proteo[,-1] + + ## Only keep highly variable proteins (as suggested by MOFA): here top ≈60%. + #For stronger selection, we can further reduce the number of proteins (e.g. only keep top 50%) + proteo_sd <- sort(apply(proteo, 1, function(x) sd(x,na.rm = T)), decreasing = T) + hist(proteo_sd, breaks = 100) + +![](MOFA_to_COSMOS_files/figure-markdown_strict/Preparing%20MOFA%20input-3.png) + + hist(proteo_sd[1:round(dim(proteo)[1]*3/5)], breaks = 100) + +![](MOFA_to_COSMOS_files/figure-markdown_strict/Preparing%20MOFA%20input-4.png) + + proteo_sd <- proteo_sd[1:round(dim(proteo)[1]*3/5)] + proteo <- proteo[names(proteo_sd),] + + # Metabolomics + ## Load data + metab <- as.data.frame(read_csv("data/metabolomic/metabolomic_clean_vsn.csv")) + rownames(metab) <- metab[,1] + metab <- metab[,-1] + + ## Since we have only a limited number of metabolite measurements, + #all measurements are kept here + metab_sd <- apply(metab, 1, function(x) sd(x,na.rm=T)) + hist(metab_sd, breaks = 100) + +![](MOFA_to_COSMOS_files/figure-markdown_strict/Preparing%20MOFA%20input-5.png) + + # Create long data frame + ## Only keep samples with each view present + overlap_patients <- intersect(intersect(names(RNA),names(proteo)),names(metab)) + RNA <- RNA[,overlap_patients] + proteo <- proteo[,overlap_patients] + metab <- metab[,overlap_patients] + + ## Create columns required for MOFA + ### RNA + RNA <- melt(as.data.frame(cbind(RNA,row.names(RNA)))) + RNA$view <- "RNA" + RNA <- RNA[,c(2,1,4,3)] + names(RNA) <- c("sample","feature","view","value") + + ### Proteomics + proteo <- melt(as.data.frame(cbind(proteo,row.names(proteo)))) + proteo$view <- "proteo" + proteo <- proteo[,c(2,1,4,3)] + names(proteo) <- c("sample","feature","view","value") + + ### Metabolomics + metab <- melt(as.data.frame(cbind(metab,row.names(metab)))) + metab$view <- "metab" + metab <- metab[,c(2,1,4,3)] + names(metab) <- c("sample","feature","view","value") + + ## Merge long data frame to one + mofa_ready_data <- as.data.frame(do.call(rbind,list(RNA,proteo,metab))) + + # Save MOFA metadata information and MOFA input + write_csv(mofa_ready_data, file = "data/mofa/mofa_data.csv") The final data frame has the following structure: -``` r -head(mofa_ready_data) -``` + head(mofa_ready_data) ## sample feature view value ## 1 786-0 KRT8 RNA 7.261 @@ -229,39 +205,35 @@ vice-versa. The imperfect correlation between trnaxcripts and proteins reflects the complexity of the translation process, as well as protein and RNA stability. -``` r -condition_prot_RNA_correlation <- sapply(unique(mofa_ready_data$sample), function(condition){ - merged_data <- merge(mofa_ready_data[which(mofa_ready_data$sample == condition & mofa_ready_data$view == "RNA"),-c(1,3)], - mofa_ready_data[which(mofa_ready_data$sample == condition & mofa_ready_data$view == "proteo"),-c(1,3)], - by = "feature") - return(cor(merged_data[,2],merged_data[,3], use = "pairwise.complete.obs")) -}) + condition_prot_RNA_correlation <- sapply(unique(mofa_ready_data$sample), function(condition){ + merged_data <- merge(mofa_ready_data[which(mofa_ready_data$sample == condition & mofa_ready_data$view == "RNA"),-c(1,3)], + mofa_ready_data[which(mofa_ready_data$sample == condition & mofa_ready_data$view == "proteo"),-c(1,3)], + by = "feature") + return(cor(merged_data[,2],merged_data[,3], use = "pairwise.complete.obs")) + }) -hist(condition_prot_RNA_correlation, breaks = 20) -``` + hist(condition_prot_RNA_correlation, breaks = 20) -![](MOFA_to_COSMOS_files/figure-gfm/Supp%20figure%20condition_prot_RNA_correlation-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Supp%20figure%20condition_prot_RNA_correlation-1.png) -``` r -overlap_genes <- unique(intersect(proteo$feature,RNA$feature)) -single_prot_RNA_correlation <- sapply(overlap_genes, function(gene){ - merged_data <- merge(mofa_ready_data[which(mofa_ready_data$feature == gene & mofa_ready_data$view == "RNA"),-c(2,3)], - mofa_ready_data[which(mofa_ready_data$feature == gene & mofa_ready_data$view == "proteo"),-c(2,3)], - by = "sample") - return(cor(merged_data[,2],merged_data[,3], use = "pairwise.complete.obs")) -}) + overlap_genes <- unique(intersect(proteo$feature,RNA$feature)) + single_prot_RNA_correlation <- sapply(overlap_genes, function(gene){ + merged_data <- merge(mofa_ready_data[which(mofa_ready_data$feature == gene & mofa_ready_data$view == "RNA"),-c(2,3)], + mofa_ready_data[which(mofa_ready_data$feature == gene & mofa_ready_data$view == "proteo"),-c(2,3)], + by = "sample") + return(cor(merged_data[,2],merged_data[,3], use = "pairwise.complete.obs")) + }) -hist(single_prot_RNA_correlation, breaks = 100) -``` + hist(single_prot_RNA_correlation, breaks = 100) -![](MOFA_to_COSMOS_files/figure-gfm/Supp%20figure%20single_prot_RNA_correlation-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Supp%20figure%20single_prot_RNA_correlation-1.png) ### MOFA run After pre-processing the data into MOFA appropriate input, let’s perform the MOFA analysis. To change specific MOFA options the function -write_MOFA_options_file is used. For more information regarding option -setting see [MOFA option +write\_MOFA\_options\_file is used. For more information regarding +option setting see [MOFA option tutorial](https://raw.githack.com/bioFAM/MOFA2_tutorials/master/R_tutorials/getting_started_R.html). Since we don’t know how many factors are needed to explain our data well, various maximum numbers of factors are tested. The final results @@ -278,13 +250,11 @@ automatically redirected to the windows store page if it’s not. Then, you can run “python3 -m pip install –user mofapy2” in a terminal to install the module to python3. -``` r -for(i in c(5:15,20,length(unique(mofa_ready_data$sample)))){ -wd <- getwd() -write_MOFA_options_file(input_file = '/data/mofa/mofa_data.csv', output_file = paste0('/results/mofa/mofa_res_',i,'factor.hdf5'), factors = i, convergence_mode = "slow", likelihoods = c('gaussian','gaussian','gaussian')) -system(paste0(paste('python3', wd, sep = " "), paste('/scripts/mofa/mofa2.py', "options_MOFA.csv"))) -} -``` + for(i in c(5:15,20,length(unique(mofa_ready_data$sample)))){ + wd <- getwd() + write_MOFA_options_file(input_file = '/data/mofa/mofa_data.csv', output_file = paste0('/results/mofa/mofa_res_',i,'factor.hdf5'), factors = i, convergence_mode = "slow", likelihoods = c('gaussian','gaussian','gaussian')) + system(paste0(paste('python3', wd, sep = " "), paste('/scripts/mofa/mofa2.py', "options_MOFA.csv"))) + } ### Investigate MOFA output @@ -293,36 +263,30 @@ together with the metadata information. Since we don’t know a-priori which is the ideal number of factors, we can settle on the highest number of factor where the model stabilises. -``` r -# Load MOFA output -for(i in c(5:15,20,length(unique(mofa_ready_data$sample)))){ - filepath <- paste0('results/mofa/mofa_res_',i,'factor.hdf5') - model <- load_model(filepath) - meta_data <- read_csv("data/metadata/RNA_metadata_cluster.csv")[,c(1,2)] - colnames(meta_data) <- c("sample","cluster") - mofa_metadata <- merge(samples_metadata(model), meta_data, by = "sample") - names(mofa_metadata) <- c("sample","group","cluster") - samples_metadata(model) <- mofa_metadata - assign(paste0("model",i),model) -} -``` + # Load MOFA output + for(i in c(5:15,20,length(unique(mofa_ready_data$sample)))){ + filepath <- paste0('results/mofa/mofa_res_',i,'factor.hdf5') + model <- load_model(filepath) + meta_data <- read_csv("data/metadata/RNA_metadata_cluster.csv")[,c(1,2)] + colnames(meta_data) <- c("sample","cluster") + mofa_metadata <- merge(samples_metadata(model), meta_data, by = "sample") + names(mofa_metadata) <- c("sample","group","cluster") + samples_metadata(model) <- mofa_metadata + assign(paste0("model",i),model) + } Then, we can compare the factors from different MOFA models and check on consistency (here: model 7-13). -``` r -compare_factors(list(model7,model8,model9,model10,model11,model12,model13), cluster_rows = F, cluster_cols = F,) -``` + compare_factors(list(model7,model8,model9,model10,model11,model12,model13), cluster_rows = F, cluster_cols = F,) -![](MOFA_to_COSMOS_files/figure-gfm/Compare%20MOFA%20models-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Compare%20MOFA%20models-1.png) Here, we can see that the inferred MOFA factor weights do not change drastically around 9. For the sake of this tutorial, we choose the 9-factor MOFA model. -``` r -model <- model10 -``` + model <- model10 The next part deals with the analysis and interpretation of the MOFA factors with specific focus on the factor weights. The classic tutorial @@ -333,20 +297,16 @@ In this context, it is important to emphasize that the factors can be interpreted similarly to principal components (PCs) in a principal component analysis (PCA). -An initial metric we can explore is the total variance ($R^2$) explained -per view by our MOFA model. This helps us understand how well the model -represents the data. +An initial metric we can explore is the total variance (*R*2) +explained per view by our MOFA model. This helps us understand how well +the model represents the data. -``` r -# Investigate factors: Explained total variance per view -plot_variance_explained(model, x="group", y="factor", plot_total = T)[[2]] -``` + # Investigate factors: Explained total variance per view + plot_variance_explained(model, x="group", y="factor", plot_total = T)[[2]] -![](MOFA_to_COSMOS_files/figure-gfm/Total%20variance%20per%20factor-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Total%20variance%20per%20factor-1.png) -``` r -calculate_variance_explained(model) -``` + calculate_variance_explained(model) ## $r2_total ## $single_group @@ -387,74 +347,66 @@ cross-correlated than protein abundances (due to the underlying regulatory structures, e.i. reaction networks vs protein synthesis and degratation regulatory mechanisms). -``` r -#Get the cross correlation for RNA and proteomic data -RNA <- dcast(mofa_ready_data[which(mofa_ready_data$view == "RNA"),-3], feature~sample, value.var = "value") -row.names(RNA) <- RNA[,1] -RNA <- RNA[,-1] + #Get the cross correlation for RNA and proteomic data + RNA <- dcast(mofa_ready_data[which(mofa_ready_data$view == "RNA"),-3], feature~sample, value.var = "value") + row.names(RNA) <- RNA[,1] + RNA <- RNA[,-1] -RNA_crosscor <- cor(t(RNA), use = "pairwise.complete.obs") -# hist(RNA_crosscor[upper.tri(RNA_crosscor)], breaks = 1000) + RNA_crosscor <- cor(t(RNA), use = "pairwise.complete.obs") + # hist(RNA_crosscor[upper.tri(RNA_crosscor)], breaks = 1000) -prot <- dcast(mofa_ready_data[which(mofa_ready_data$view == "proteo"),-3], feature~sample, value.var = "value") -row.names(prot) <- prot[,1] -prot <- prot[,-1] + prot <- dcast(mofa_ready_data[which(mofa_ready_data$view == "proteo"),-3], feature~sample, value.var = "value") + row.names(prot) <- prot[,1] + prot <- prot[,-1] -prot_crosscor <- cor(t(prot), use = "pairwise.complete.obs") -# hist(prot_crosscor[upper.tri(prot_crosscor)], breaks = 1000) + prot_crosscor <- cor(t(prot), use = "pairwise.complete.obs") + # hist(prot_crosscor[upper.tri(prot_crosscor)], breaks = 1000) -metabo <- dcast(mofa_ready_data[which(mofa_ready_data$view == "metab"),-3], feature~sample, value.var = "value") -row.names(metabo) <- metabo[,1] -metabo <- metabo[,-1] + metabo <- dcast(mofa_ready_data[which(mofa_ready_data$view == "metab"),-3], feature~sample, value.var = "value") + row.names(metabo) <- metabo[,1] + metabo <- metabo[,-1] -metabo_crosscor <- cor(t(metabo), use = "pairwise.complete.obs") -# hist(metabo_crosscor[upper.tri(metabo_crosscor)], breaks = 1000) + metabo_crosscor <- cor(t(metabo), use = "pairwise.complete.obs") + # hist(metabo_crosscor[upper.tri(metabo_crosscor)], breaks = 1000) -# plot(calculate_variance_explained(model)$r2_total$single_group, c(mean(RNA_crosscor),mean(metabo_crosscor),mean(prot_crosscor))) -plot(calculate_variance_explained(model)$r2_total$single_group, c(dim(RNA_crosscor)[1],dim(metabo_crosscor)[1],dim(prot_crosscor)[1])) -``` + # plot(calculate_variance_explained(model)$r2_total$single_group, c(mean(RNA_crosscor),mean(metabo_crosscor),mean(prot_crosscor))) + plot(calculate_variance_explained(model)$r2_total$single_group, c(dim(RNA_crosscor)[1],dim(metabo_crosscor)[1],dim(prot_crosscor)[1])) -![](MOFA_to_COSMOS_files/figure-gfm/Chekc%20cross%20correlation%20of%20omics%20compared%20to%20variance%20epxlained-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Chekc%20cross%20correlation%20of%20omics%20compared%20to%20variance%20epxlained-1.png) -``` r -df_variance_crosscor <- as.data.frame(cbind(calculate_variance_explained(model)$r2_total$single_group, - c(mean(RNA_crosscor),mean(metabo_crosscor),mean(prot_crosscor)), - c(dim(RNA_crosscor)[1],dim(metabo_crosscor)[1],dim(prot_crosscor)[1]))) -names(df_variance_crosscor) <- c("var_expl","mean_crosscor","n_features") -df_variance_crosscor$prod <- sqrt(df_variance_crosscor$mean_crosscor) * df_variance_crosscor$n_features + df_variance_crosscor <- as.data.frame(cbind(calculate_variance_explained(model)$r2_total$single_group, + c(mean(RNA_crosscor),mean(metabo_crosscor),mean(prot_crosscor)), + c(dim(RNA_crosscor)[1],dim(metabo_crosscor)[1],dim(prot_crosscor)[1]))) + names(df_variance_crosscor) <- c("var_expl","mean_crosscor","n_features") + df_variance_crosscor$prod <- sqrt(df_variance_crosscor$mean_crosscor) * df_variance_crosscor$n_features -#Just plot them as scatter plot with regression lines -mofa_solved <- ggplot(df_variance_crosscor, aes(x = prod, y = var_expl)) + - geom_point() + - geom_smooth(method='lm', formula= y~x) + - theme_minimal() + - xlab("sqrt(mean cross-corelation of view) * number of features") + - ylab("variance explained for each view") + - ggtitle("MOFA size and cross cor influence") + #Just plot them as scatter plot with regression lines + mofa_solved <- ggplot(df_variance_crosscor, aes(x = prod, y = var_expl)) + + geom_point() + + geom_smooth(method='lm', formula= y~x) + + theme_minimal() + + xlab("sqrt(mean cross-corelation of view) * number of features") + + ylab("variance explained for each view") + + ggtitle("MOFA size and cross cor influence") -mofa_solved -``` + mofa_solved -![](MOFA_to_COSMOS_files/figure-gfm/Chekc%20cross%20correlation%20of%20omics%20compared%20to%20variance%20epxlained-2.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Chekc%20cross%20correlation%20of%20omics%20compared%20to%20variance%20epxlained-2.png) -``` r -mofa_nfeatures <- ggplot(df_variance_crosscor, aes(x = n_features, y = var_expl)) + - geom_point() + - geom_smooth(method='lm', formula= y~x) + - theme_minimal() + - xlab("sqrt(mean cross-corelation of view) * number of features") + - ylab("variance explained for each view") + - ggtitle("MOFA size influence") + mofa_nfeatures <- ggplot(df_variance_crosscor, aes(x = n_features, y = var_expl)) + + geom_point() + + geom_smooth(method='lm', formula= y~x) + + theme_minimal() + + xlab("sqrt(mean cross-corelation of view) * number of features") + + ylab("variance explained for each view") + + ggtitle("MOFA size influence") -mofa_nfeatures -``` + mofa_nfeatures -![](MOFA_to_COSMOS_files/figure-gfm/Chekc%20cross%20correlation%20of%20omics%20compared%20to%20variance%20epxlained-3.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Chekc%20cross%20correlation%20of%20omics%20compared%20to%20variance%20epxlained-3.png) -``` r -#Get the actual coeficient signficance -summary(lm(data= df_variance_crosscor, var_expl~prod)) -``` + #Get the actual coeficient signficance + summary(lm(data= df_variance_crosscor, var_expl~prod)) ## ## Call: @@ -475,9 +427,7 @@ summary(lm(data= df_variance_crosscor, var_expl~prod)) ## Multiple R-squared: 0.9987, Adjusted R-squared: 0.9974 ## F-statistic: 767.1 on 1 and 1 DF, p-value: 0.02297 -``` r -summary(lm(data= df_variance_crosscor, var_expl~n_features)) -``` + summary(lm(data= df_variance_crosscor, var_expl~n_features)) ## ## Call: @@ -497,20 +447,16 @@ summary(lm(data= df_variance_crosscor, var_expl~n_features)) ## F-statistic: 25.43 on 1 and 1 DF, p-value: 0.1246 Since we would like to use the found correlation for downstream COSMOS -analysis, we first investigate the variance ($R^2$) each factor can -explain per view as well as investigate the factor explaining the most -variance across all views. +analysis, we first investigate the variance (*R*2) each +factor can explain per view as well as investigate the factor explaining +the most variance across all views. -``` r -# Investigate factors: Explained variance per view for each factor -pheatmap(model@cache$variance_explained$r2_per_factor[[1]], display_numbers = T, angle_col = "0", legend_labels = c("0","10", "20", "30", "40", "Variance\n\n"), legend = T, main = "", legend_breaks = c(0,10, 20, 30, 40, max(model@cache$variance_explained$r2_per_factor[[1]])), cluster_rows = F, cluster_cols = F, color = colorRampPalette(c("white","red"))(100), fontsize_number = 10) -``` + # Investigate factors: Explained variance per view for each factor + pheatmap(model@cache$variance_explained$r2_per_factor[[1]], display_numbers = T, angle_col = "0", legend_labels = c("0","10", "20", "30", "40", "Variance\n\n"), legend = T, main = "", legend_breaks = c(0,10, 20, 30, 40, max(model@cache$variance_explained$r2_per_factor[[1]])), cluster_rows = F, cluster_cols = F, color = colorRampPalette(c("white","red"))(100), fontsize_number = 10) -![](MOFA_to_COSMOS_files/figure-gfm/Variance%20per%20view%20per%20factor-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Variance%20per%20view%20per%20factor-1.png) -``` r -pheatmap(model@cache$variance_explained$r2_per_factor[[1]], display_numbers = T, angle_col = "0", legend_labels = c("0","10", "20", "30", "40", "Variance\n\n"), legend = T, main = "", legend_breaks = c(0,10, 20, 30, 40, max(model@cache$variance_explained$r2_per_factor[[1]])), cluster_rows = F, cluster_cols = F, color = colorRampPalette(c("white","red"))(100), fontsize_number = 10,filename = "results/mofa/variance_heatmap.pdf",width = 4, height = 2.5) -``` + pheatmap(model@cache$variance_explained$r2_per_factor[[1]], display_numbers = T, angle_col = "0", legend_labels = c("0","10", "20", "30", "40", "Variance\n\n"), legend = T, main = "", legend_breaks = c(0,10, 20, 30, 40, max(model@cache$variance_explained$r2_per_factor[[1]])), cluster_rows = F, cluster_cols = F, color = colorRampPalette(c("white","red"))(100), fontsize_number = 10,filename = "results/mofa/variance_heatmap.pdf",width = 4, height = 2.5) By analyzing this heatmap, we can observe that Factor 1 and 5 explain a large proportion of variance of the RNA, Factor 2 and 4 mainly explains @@ -521,11 +467,9 @@ variance across all views. We can also analyze the correlation between factors. The next plot highlights the spearman correlation between each factor. -``` r -# Investigate factors: Correlation between factors -pdf(file="results/mofa/plot_factor_cor.pdf", height = 5, width = 5) -plot_factor_cor(model, type = 'upper', method = "spearman", addCoef.col = "black") -``` + # Investigate factors: Correlation between factors + pdf(file="results/mofa/plot_factor_cor.pdf", height = 5, width = 5) + plot_factor_cor(model, type = 'upper', method = "spearman", addCoef.col = "black") Next, we can characterize the factors with the available metadata of samples. Indeed, it can be interesting to see if some factors are @@ -542,9 +486,7 @@ category in a factor compared to any other tissue categories interpret insight that serves our purpose well. To do so, we use the Univariate Linear Model (ULM) function of decoupleR. -``` r -NCI_60_metadata <- as.data.frame(read_csv(file = "data/metadata/RNA_metadata_cluster.csv")) -``` + NCI_60_metadata <- as.data.frame(read_csv(file = "data/metadata/RNA_metadata_cluster.csv")) ## Rows: 60 Columns: 16 ## ── Column specification ──────────────────────────────────────────────────────── @@ -555,92 +497,86 @@ NCI_60_metadata <- as.data.frame(read_csv(file = "data/metadata/RNA_metadata_clu ## ℹ Use `spec()` to retrieve the full column specification for this data. ## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. -``` r -#discretize age category so that regression can be applied -NCI_60_metadata$`age over 50` <- NCI_60_metadata$`age a` > 50 -tissues <-NCI_60_metadata[,c(3,1)] -names(tissues) <- c("source","target") + #discretize age category so that regression can be applied + NCI_60_metadata$`age over 50` <- NCI_60_metadata$`age a` > 50 + tissues <-NCI_60_metadata[,c(3,1)] + names(tissues) <- c("source","target") -Z_matrix <- as.data.frame(model@expectations$Z$single_group) + Z_matrix <- as.data.frame(model@expectations$Z$single_group) -#Check specifically tissue meta-data association with Z matrix -tissue_enrichment <- decoupleR::run_ulm(Z_matrix, tissues) -tissue_enrichment <- reshape2::dcast(tissue_enrichment, source~condition, value.var = "score") -row.names(tissue_enrichment) <- tissue_enrichment$source -tissue_enrichment <- tissue_enrichment[,-1] + #Check specifically tissue meta-data association with Z matrix + tissue_enrichment <- decoupleR::run_ulm(Z_matrix, tissues) + tissue_enrichment <- reshape2::dcast(tissue_enrichment, source~condition, value.var = "score") + row.names(tissue_enrichment) <- tissue_enrichment$source + tissue_enrichment <- tissue_enrichment[,-1] -#plot them as heatmaps -palette <- make_heatmap_color_palette(tissue_enrichment) -pheatmap(t(tissue_enrichment), show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = F, filename = "results/mofa/mofa_tissue_enrichment.pdf", width = 3, height = 2.6) -pheatmap(tissue_enrichment, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = T) -``` + #plot them as heatmaps + palette <- make_heatmap_color_palette(tissue_enrichment) + pheatmap(t(tissue_enrichment), show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = F, filename = "results/mofa/mofa_tissue_enrichment.pdf", width = 3, height = 2.6) + pheatmap(tissue_enrichment, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = T) Here, we can see that factor 4 seems to be strongly associated with eukemia, and factor 2 with melanoma. -``` r -#format meta-data names -all_metadata <- reshape2::melt(NCI_60_metadata[,c(1,3,17,5,7,8,10,13,14)], id.vars = "cell_line") -all_metadata$variable <- gsub(" a$","",all_metadata$variable) -all_metadata$variable <- gsub(" a,c$","",all_metadata$variable) -all_metadata$variable <- gsub(" d$","",all_metadata$variable) -all_metadata$variable <- gsub("histology","histo",all_metadata$variable) -all_metadata$variable <- gsub("tissue of origin","tissue",all_metadata$variable) -all_metadata$value <- gsub("[(]81-103[)]","",all_metadata$value) -all_metadata$value <- gsub("[(]35-57[)]","",all_metadata$value) -all_metadata$value <- gsub("Memorial Sloan Kettering Cancer Center","Memorial SKCC",all_metadata$value) -all_metadata$value <- gsub("Malignant melanotic melanoma","Malig. melanotic melan.",all_metadata$value) -all_metadata$value <- gsub("Ductal carcinoma-mammary gland","Duct. carci-mam. gland",all_metadata$value) - -all_metadata$target <- paste(all_metadata$variable, all_metadata$value, sep = ": ") -all_metadata <- all_metadata[,c(4,1)] -names(all_metadata) <- c("source","target") - -Z_matrix <- as.data.frame(model@expectations$Z$single_group) - -#run ULM on the Z matrix meta data to find associated metadata features -all_metadata_enrichment <- decoupleR::run_ulm(Z_matrix, all_metadata, minsize = 3) -all_metadata_enrichment <- reshape2::dcast(all_metadata_enrichment, source~condition, value.var = "score") -row.names(all_metadata_enrichment) <- all_metadata_enrichment$source -all_metadata_enrichment <- all_metadata_enrichment[,-1] -all_metadata_enrichment_top <- all_metadata_enrichment[ - apply(all_metadata_enrichment, 1, function(x){max(abs(x)) > 2}), -] - -#Plot the heatmap -palette <- make_heatmap_color_palette(all_metadata_enrichment_top) -pheatmap(t(all_metadata_enrichment_top), show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = F, filename = "results/mofa/mofa_metadata_enrichment.pdf", width = 4.5, height = 4) -pheatmap(t(all_metadata_enrichment_top), show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = T) -``` + #format meta-data names + all_metadata <- reshape2::melt(NCI_60_metadata[,c(1,3,17,5,7,8,10,13,14)], id.vars = "cell_line") + all_metadata$variable <- gsub(" a$","",all_metadata$variable) + all_metadata$variable <- gsub(" a,c$","",all_metadata$variable) + all_metadata$variable <- gsub(" d$","",all_metadata$variable) + all_metadata$variable <- gsub("histology","histo",all_metadata$variable) + all_metadata$variable <- gsub("tissue of origin","tissue",all_metadata$variable) + all_metadata$value <- gsub("[(]81-103[)]","",all_metadata$value) + all_metadata$value <- gsub("[(]35-57[)]","",all_metadata$value) + all_metadata$value <- gsub("Memorial Sloan Kettering Cancer Center","Memorial SKCC",all_metadata$value) + all_metadata$value <- gsub("Malignant melanotic melanoma","Malig. melanotic melan.",all_metadata$value) + all_metadata$value <- gsub("Ductal carcinoma-mammary gland","Duct. carci-mam. gland",all_metadata$value) + + all_metadata$target <- paste(all_metadata$variable, all_metadata$value, sep = ": ") + all_metadata <- all_metadata[,c(4,1)] + names(all_metadata) <- c("source","target") + + Z_matrix <- as.data.frame(model@expectations$Z$single_group) + + #run ULM on the Z matrix meta data to find associated metadata features + all_metadata_enrichment <- decoupleR::run_ulm(Z_matrix, all_metadata, minsize = 3) + all_metadata_enrichment <- reshape2::dcast(all_metadata_enrichment, source~condition, value.var = "score") + row.names(all_metadata_enrichment) <- all_metadata_enrichment$source + all_metadata_enrichment <- all_metadata_enrichment[,-1] + all_metadata_enrichment_top <- all_metadata_enrichment[ + apply(all_metadata_enrichment, 1, function(x){max(abs(x)) > 2}), + ] + + #Plot the heatmap + palette <- make_heatmap_color_palette(all_metadata_enrichment_top) + pheatmap(t(all_metadata_enrichment_top), show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = F, filename = "results/mofa/mofa_metadata_enrichment.pdf", width = 4.5, height = 4) + pheatmap(t(all_metadata_enrichment_top), show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315,display_numbers = T) Further, we can inspect the composition of the factors by plotting the top 10 weights per factor and view. -``` r -## Top 20 factor weights per view -grid.arrange( - ### RNA - plot_top_weights(model, - view = "RNA", - factor = 4, - nfeatures = 10) + - ggtitle("RNA"), - ### Proteomics - plot_top_weights(model, - view = "proteo", - factor = 4, - nfeatures = 10) + - ggtitle("Proteomics"), - ### Metabolomics - plot_top_weights(model, - view = "metab", - factor = 4, - nfeatures = 10) + - ggtitle("Metabolomics"), ncol =3 -) -``` - -![](MOFA_to_COSMOS_files/figure-gfm/Factor%20weights%20per%20view-1.png) + ## Top 20 factor weights per view + grid.arrange( + ### RNA + plot_top_weights(model, + view = "RNA", + factor = 4, + nfeatures = 10) + + ggtitle("RNA"), + ### Proteomics + plot_top_weights(model, + view = "proteo", + factor = 4, + nfeatures = 10) + + ggtitle("Proteomics"), + ### Metabolomics + plot_top_weights(model, + view = "metab", + factor = 4, + nfeatures = 10) + + ggtitle("Metabolomics"), ncol =3 + ) + +![](MOFA_to_COSMOS_files/figure-markdown_strict/Factor%20weights%20per%20view-1.png) Using this visualization, features with strong association with the factor (large absolute values) can be easily identified and further @@ -652,71 +588,63 @@ R)](https://raw.githack.com/bioFAM/MOFA2_tutorials/master/R_tutorials/downstream ### From MOFA output to TF activity, ligand-receptor activity -In the next step, using the weights of Factor 4 (highest shared $R^2$ -across views), different databases (liana and dorothea) and decoupleR, -the transcription factor activities, ligand-receptor scores are -estimated. +In the next step, using the weights of Factor 4 (highest shared +*R*2 across views), different databases (liana and dorothea) +and decoupleR, the transcription factor activities, ligand-receptor +scores are estimated. First, we have to extract the weights from our MOFA model and keep the weights from Factor 4 in the RNA view. -``` r -weights <- get_weights(model, views = "all", factors = "all") + weights <- get_weights(model, views = "all", factors = "all") -save(weights, file = "results/mofa/mofa_weights.RData") -# Extract Factor with highest explained variance in RNA view -RNA <- data.frame(weights$RNA[,4]) -row.names(RNA) <- gsub("_RNA","",row.names(RNA)) -``` + save(weights, file = "results/mofa/mofa_weights.RData") + # Extract Factor with highest explained variance in RNA view + RNA <- data.frame(weights$RNA[,4]) + row.names(RNA) <- gsub("_RNA","",row.names(RNA)) Then we load the consensus networks from our databases, as well as the TF-targets regulons from collectri. -``` r -# Load LIANA (receptor and ligand) consensus network -# ligrec_ressource <- distinct(liana::decomplexify(liana::select_resource("Consensus")[[1]])) -# save(ligrec_ressource, file = "support/ligrec_ressource.RData") + # Load LIANA (receptor and ligand) consensus network + # ligrec_ressource <- distinct(liana::decomplexify(liana::select_resource("Consensus")[[1]])) + # save(ligrec_ressource, file = "support/ligrec_ressource.RData") -#process the LR interaction prior knowledge to make suitable input for decoupleR -load(file = "support/ligrec_ressource.RData") -ligrec_geneset <- format_LR_ressource(ligrec_ressource) + #process the LR interaction prior knowledge to make suitable input for decoupleR + load(file = "support/ligrec_ressource.RData") + ligrec_geneset <- format_LR_ressource(ligrec_ressource) -# Load Dorothea (TF) network -# dorothea_df <- decoupleR::get_collectri() -# save(dorothea_df, file = "support/dorothea_df.RData") -load(file = "support/dorothea_df.RData") -``` + # Load Dorothea (TF) network + # dorothea_df <- decoupleR::get_collectri() + # save(dorothea_df, file = "support/dorothea_df.RData") + load(file = "support/dorothea_df.RData") We can display what are the top weights of the model and to which factor they are associated. -``` r -RNA_all <- as.data.frame(weights$RNA) -row.names(RNA_all) <- gsub("_RNA","",row.names(RNA_all)) + RNA_all <- as.data.frame(weights$RNA) + row.names(RNA_all) <- gsub("_RNA","",row.names(RNA_all)) -#Extract the top weights -RNA_top <- RNA_all[order(apply(RNA_all,1,function(x){max(abs(x))}), decreasing = T)[1:25],] + #Extract the top weights + RNA_top <- RNA_all[order(apply(RNA_all,1,function(x){max(abs(x))}), decreasing = T)[1:25],] -#plot them as heatmaps -palette <- make_heatmap_color_palette(RNA_top) -pheatmap(RNA_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_RNA.pdf", width = 4, height = 4.3) -pheatmap(RNA_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) -``` + #plot them as heatmaps + palette <- make_heatmap_color_palette(RNA_top) + pheatmap(RNA_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_RNA.pdf", width = 4, height = 4.3) + pheatmap(RNA_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) We can do the same for metabolites -``` r -metabs_all <- as.data.frame(weights$metab) -row.names(metabs_all) <- gsub("_metab","",row.names(metabs_all)) + metabs_all <- as.data.frame(weights$metab) + row.names(metabs_all) <- gsub("_metab","",row.names(metabs_all)) -#extract top metab weights -metabs_top <- metabs_all[order(apply(metabs_all,1,function(x){max(abs(x))}), decreasing = T)[1:10],] + #extract top metab weights + metabs_top <- metabs_all[order(apply(metabs_all,1,function(x){max(abs(x))}), decreasing = T)[1:10],] -#plot them as heatmaps -palette <- make_heatmap_color_palette(metabs_top) -pheatmap(metabs_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_metabs.pdf", width = 4, height = 2.2) -pheatmap(metabs_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) -``` + #plot them as heatmaps + palette <- make_heatmap_color_palette(metabs_top) + pheatmap(metabs_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_metabs.pdf", width = 4, height = 2.2) + pheatmap(metabs_top, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) Then, by using decoupleR and prior knowledge networks, we calculate the different regulatory activities inferred by the ULM approach. Depending @@ -724,73 +652,63 @@ on the task, the minimum number of targets per source varies required to maintain the source in the output (e.g. two targets by definition for ligand-receptor interactions). Here can display them as heatmaps. -``` r -# Calculate regulatory activities from Receptor and Ligand network -ligrec_factors <- run_ulm(mat = as.matrix(RNA_all), network = ligrec_geneset, .source = set, .target = gene, minsize = 2) -ligrec_factors_df <- wide_ulm_res(ligrec_factors) -ligrec_factors_df <- ligrec_factors_df[order(apply(ligrec_factors_df,1,function(x){max(abs(x))}), decreasing = T)[1:25],] + # Calculate regulatory activities from Receptor and Ligand network + ligrec_factors <- run_ulm(mat = as.matrix(RNA_all), network = ligrec_geneset, .source = set, .target = gene, minsize = 2) + ligrec_factors_df <- wide_ulm_res(ligrec_factors) + ligrec_factors_df <- ligrec_factors_df[order(apply(ligrec_factors_df,1,function(x){max(abs(x))}), decreasing = T)[1:25],] -#plot them as heatmaps -palette <- make_heatmap_color_palette(ligrec_factors_df) -pheatmap(ligrec_factors_df, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_ligrec.pdf", width = 4, height = 4.3) -pheatmap(ligrec_factors_df, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) -``` + #plot them as heatmaps + palette <- make_heatmap_color_palette(ligrec_factors_df) + pheatmap(ligrec_factors_df, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_ligrec.pdf", width = 4, height = 4.3) + pheatmap(ligrec_factors_df, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) -``` r -# Calculate regulatory activities from TF network -TF_factors <- run_ulm(mat = as.matrix(RNA_all), network = dorothea_df, minsize = 10) -TF_factors <- wide_ulm_res(TF_factors) -TF_factors <- TF_factors[order(apply(TF_factors,1,function(x){max(abs(x))}), decreasing = T)[1:25],] + # Calculate regulatory activities from TF network + TF_factors <- run_ulm(mat = as.matrix(RNA_all), network = dorothea_df, minsize = 10) + TF_factors <- wide_ulm_res(TF_factors) + TF_factors <- TF_factors[order(apply(TF_factors,1,function(x){max(abs(x))}), decreasing = T)[1:25],] -#Plot them as heatmaps -palette <- make_heatmap_color_palette(TF_factors) -pheatmap(TF_factors, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_TF.pdf", width = 4, height = 4.3) -pheatmap(TF_factors, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) -``` + #Plot them as heatmaps + palette <- make_heatmap_color_palette(TF_factors) + pheatmap(TF_factors, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315, filename = "results/mofa/mofa_top_TF.pdf", width = 4, height = 4.3) + pheatmap(TF_factors, show_rownames = T, cluster_cols = F, cluster_rows = F,color = palette, angle_col = 315) These activity estimations are saved in an input file for further downstream analysis as well. -``` r -# Calculate regulatory activities from Receptor and Ligand network -ligrec_high_vs_low <- run_ulm(mat = as.matrix(RNA), network = ligrec_geneset, .source = set, .target = gene, minsize = 2) -ligrec_high_vs_low <- ligrec_high_vs_low[ligrec_high_vs_low$statistic == "ulm",] -ligrec_high_vs_low_vector <- ligrec_high_vs_low$score -names(ligrec_high_vs_low_vector) <- ligrec_high_vs_low$source + # Calculate regulatory activities from Receptor and Ligand network + ligrec_high_vs_low <- run_ulm(mat = as.matrix(RNA), network = ligrec_geneset, .source = set, .target = gene, minsize = 2) + ligrec_high_vs_low <- ligrec_high_vs_low[ligrec_high_vs_low$statistic == "ulm",] + ligrec_high_vs_low_vector <- ligrec_high_vs_low$score + names(ligrec_high_vs_low_vector) <- ligrec_high_vs_low$source -ligrec_high_vs_low_top <- ligrec_high_vs_low[order(abs(ligrec_high_vs_low$score),decreasing = T)[1:15],c(2,4)] -ligrec_high_vs_low_top$source <- factor(ligrec_high_vs_low_top$source, levels = ligrec_high_vs_low_top$source) -ggplot(ligrec_high_vs_low_top, aes(x=source, y = score)) + geom_bar(stat= "identity",position = "dodge") + theme_minimal() + theme(axis.text.x = element_text(angle = 315, vjust = 0.5, hjust=0)) -``` + ligrec_high_vs_low_top <- ligrec_high_vs_low[order(abs(ligrec_high_vs_low$score),decreasing = T)[1:15],c(2,4)] + ligrec_high_vs_low_top$source <- factor(ligrec_high_vs_low_top$source, levels = ligrec_high_vs_low_top$source) + ggplot(ligrec_high_vs_low_top, aes(x=source, y = score)) + geom_bar(stat= "identity",position = "dodge") + theme_minimal() + theme(axis.text.x = element_text(angle = 315, vjust = 0.5, hjust=0)) -![](MOFA_to_COSMOS_files/figure-gfm/Activity%20estimations%20for%20factor%204-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Activity%20estimations%20for%20factor%204-1.png) -``` r -# Calculate regulatory activities from TF network -TF_high_vs_low <- run_ulm(mat = as.matrix(RNA), network = dorothea_df, minsize = 10) -TF_high_vs_low <- TF_high_vs_low[TF_high_vs_low$statistic == "ulm",] -TF_high_vs_low_vector <- TF_high_vs_low$score -names(TF_high_vs_low_vector) <- TF_high_vs_low$source + # Calculate regulatory activities from TF network + TF_high_vs_low <- run_ulm(mat = as.matrix(RNA), network = dorothea_df, minsize = 10) + TF_high_vs_low <- TF_high_vs_low[TF_high_vs_low$statistic == "ulm",] + TF_high_vs_low_vector <- TF_high_vs_low$score + names(TF_high_vs_low_vector) <- TF_high_vs_low$source -# Prepare moon analysis input -TF_high_vs_low <- as.data.frame(TF_high_vs_low[,c(2,4)]) -row.names(TF_high_vs_low) <- TF_high_vs_low[,1] + # Prepare moon analysis input + TF_high_vs_low <- as.data.frame(TF_high_vs_low[,c(2,4)]) + row.names(TF_high_vs_low) <- TF_high_vs_low[,1] -TF_high_vs_low_top_10 <- TF_high_vs_low[order(abs(TF_high_vs_low$score), decreasing = T)[1:10],] -TF_high_vs_low_top_10$source <- factor(TF_high_vs_low_top_10$source, levels = TF_high_vs_low_top_10$source) -ggplot(TF_high_vs_low_top_10, aes(x=source, y = score)) + geom_bar(stat= "identity",position = "dodge") + theme_minimal() -``` + TF_high_vs_low_top_10 <- TF_high_vs_low[order(abs(TF_high_vs_low$score), decreasing = T)[1:10],] + TF_high_vs_low_top_10$source <- factor(TF_high_vs_low_top_10$source, levels = TF_high_vs_low_top_10$source) + ggplot(TF_high_vs_low_top_10, aes(x=source, y = score)) + geom_bar(stat= "identity",position = "dodge") + theme_minimal() -![](MOFA_to_COSMOS_files/figure-gfm/Activity%20estimations%20for%20factor%204-2.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Activity%20estimations%20for%20factor%204-2.png) -``` r -# Combine results -ligrec_TF_moon_inputs <- list("ligrec" = ligrec_high_vs_low_vector, - "TF" = TF_high_vs_low_vector) + # Combine results + ligrec_TF_moon_inputs <- list("ligrec" = ligrec_high_vs_low_vector, + "TF" = TF_high_vs_low_vector) -save(ligrec_TF_moon_inputs, file = "data/cosmos/ligrec_TF_moon_inputs.Rdata") -``` + save(ligrec_TF_moon_inputs, file = "data/cosmos/ligrec_TF_moon_inputs.Rdata") We have now successfully used the MOFA weights (Factor 4, RNA view) to infer not only TF activities but also the ligand-receptor interactions. @@ -806,67 +724,61 @@ Here, in the first step, the data is loaded. Besides the activity estimations and the factor weights, the previous filtered out expressed gene names are loaded. -``` r -# Load data -load(file = "data/cosmos/ligrec_TF_moon_inputs.Rdata") + # Load data + load(file = "data/cosmos/ligrec_TF_moon_inputs.Rdata") -signaling_input <- ligrec_TF_moon_inputs$TF -ligrec_input <- ligrec_TF_moon_inputs$ligrec + signaling_input <- ligrec_TF_moon_inputs$TF + ligrec_input <- ligrec_TF_moon_inputs$ligrec -RNA_input <- weights$RNA[,4] -prot_input <- weights$proteo[,4] -metab_inputs <- weights$metab[,4] + RNA_input <- weights$RNA[,4] + prot_input <- weights$proteo[,4] + metab_inputs <- weights$metab[,4] -#remove MOFA tags from feature names -names(RNA_input) <- gsub("_RNA","",names(RNA_input)) -names(prot_input) <- gsub("_proteo","",names(prot_input)) + #remove MOFA tags from feature names + names(RNA_input) <- gsub("_RNA","",names(RNA_input)) + names(prot_input) <- gsub("_proteo","",names(prot_input)) -RNA_log2_FPKM <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv"))$Genes #since only complete cases were considered in the first part, here the full gene list is loaded -``` + RNA_log2_FPKM <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv"))$Genes #since only complete cases were considered in the first part, here the full gene list is loaded We can be interested in looking how RNA and corresponding proteins correlate in each factor. We can plot this using scatter plots for each factors. -``` r -weights_RNA <- as.data.frame(weights$RNA) -weights_prot <- as.data.frame(weights$proteo) - -weights_prot$ID <- gsub("_proteo$","",row.names(weights_prot)) -weights_RNA$ID <- gsub("_RNA$","",row.names(weights_RNA)) - -plot_list <- list() -r2_list <- list() -for(i in 1:(dim(weights_prot)[2]-1)) -{ - merged_weights <- merge(weights_RNA[,c(i,10)],weights_prot[,c(i,10)], by = "ID") - names(merged_weights) <- c("ID","RNA","prot") - r2_list[[i]] <- cor(merged_weights[,2],merged_weights[,3])^2 - plot_list[[i]] <- ggplot(merged_weights,aes(x = RNA,y = prot)) + geom_point() + - geom_smooth(method='lm', formula= y~x) + - theme_minimal() + - theme(axis.title.x=element_blank(), - axis.text.x=element_blank(), - axis.ticks.x=element_blank(), - axis.title.y=element_blank(), - axis.text.y=element_blank(), - axis.ticks.y=element_blank()) -} - -r2_vec <- unlist(r2_list) -r2_vec -``` + weights_RNA <- as.data.frame(weights$RNA) + weights_prot <- as.data.frame(weights$proteo) + + weights_prot$ID <- gsub("_proteo$","",row.names(weights_prot)) + weights_RNA$ID <- gsub("_RNA$","",row.names(weights_RNA)) + + plot_list <- list() + r2_list <- list() + for(i in 1:(dim(weights_prot)[2]-1)) + { + merged_weights <- merge(weights_RNA[,c(i,10)],weights_prot[,c(i,10)], by = "ID") + names(merged_weights) <- c("ID","RNA","prot") + r2_list[[i]] <- cor(merged_weights[,2],merged_weights[,3])^2 + plot_list[[i]] <- ggplot(merged_weights,aes(x = RNA,y = prot)) + geom_point() + + geom_smooth(method='lm', formula= y~x) + + theme_minimal() + + theme(axis.title.x=element_blank(), + axis.text.x=element_blank(), + axis.ticks.x=element_blank(), + axis.title.y=element_blank(), + axis.text.y=element_blank(), + axis.ticks.y=element_blank()) + } + + r2_vec <- unlist(r2_list) + r2_vec ## [1] 0.041804090 0.411093174 0.038741311 0.421953345 0.002666596 0.291243355 ## [7] 0.027585257 0.058177119 0.019215245 -``` r -ggsave(filename = "results/mofa/RNA_prot_cor.pdf",plot = do.call("grid.arrange", c(plot_list, ncol = 3)), device = "pdf") -``` + ggsave(filename = "results/mofa/RNA_prot_cor.pdf",plot = do.call("grid.arrange", c(plot_list, ncol = 3)), device = "pdf") ## Saving 7 x 7 in image -![](MOFA_to_COSMOS_files/figure-gfm/correlation%20RNA/prot-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/correlation%20RNA/prot-1.png) Coherently, only the factors where there is variance explained for both RNA and proteins are correlated. @@ -879,22 +791,18 @@ First, the RNA factor weights are filtered based on a threshold defined by their distribution as well a threshold defined by the distribution of the protein factor weights. -``` r -##RNA -{plot(density(RNA_input)) -abline(v = -0.2) -abline(v = 0.2)} -``` + ##RNA + {plot(density(RNA_input)) + abline(v = -0.2) + abline(v = 0.2)} -![](MOFA_to_COSMOS_files/figure-gfm/Plot:%20RNA%20weights%20&%20protein%20weights-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Plot%20RNA%20weights%20and%20protein%20weights-1.png) -``` r -{plot(density(prot_input)) -abline(v = -0.05) -abline(v = 0.05)} -``` + {plot(density(prot_input)) + abline(v = -0.05) + abline(v = 0.05)} -![](MOFA_to_COSMOS_files/figure-gfm/Plot:%20RNA%20weights%20&%20protein%20weights-2.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Plot%20RNA%20weights%20and%20protein%20weights-2.png) Based on the plots and indicated by the straight lines, the RNA weights threshold is set to -0.2 and 0.2, and the protein weights threshold to @@ -902,65 +810,57 @@ threshold is set to -0.2 and 0.2, and the protein weights threshold to set to 0 and previously filtered out genes (before MOFA analysis) are also included with value 0. -``` r -for(gene in names(RNA_input)) { - if (RNA_input[gene] > -0.2 & RNA_input[gene] < 0.2) - { - RNA_input[gene] <- 0 - } else - { - RNA_input[gene] <- sign(RNA_input[gene]) * 10 - } - if (gene %in% names(prot_input)) #will need to add a sign consistency check between RNA and prot as well - { - if (prot_input[gene] > -0.05 & prot_input[gene] < 0.05) - { - RNA_input[gene] <- 0 - } else - { - RNA_input[gene] <- sign(RNA_input[gene]) * 10 + for(gene in names(RNA_input)) { + if (RNA_input[gene] > -0.2 & RNA_input[gene] < 0.2) + { + RNA_input[gene] <- 0 + } else + { + RNA_input[gene] <- sign(RNA_input[gene]) * 10 + } + if (gene %in% names(prot_input)) #will need to add a sign consistency check between RNA and prot as well + { + if (prot_input[gene] > -0.05 & prot_input[gene] < 0.05) + { + RNA_input[gene] <- 0 + } else + { + RNA_input[gene] <- sign(RNA_input[gene]) * 10 + } + } } - } -} -RNA_log2_FPKM <- RNA_log2_FPKM[!(RNA_log2_FPKM %in% names(RNA_input))] -genes <- RNA_log2_FPKM -RNA_log2_FPKM <- rep(0,length(RNA_log2_FPKM)) -names(RNA_log2_FPKM) <- genes + RNA_log2_FPKM <- RNA_log2_FPKM[!(RNA_log2_FPKM %in% names(RNA_input))] + genes <- RNA_log2_FPKM + RNA_log2_FPKM <- rep(0,length(RNA_log2_FPKM)) + names(RNA_log2_FPKM) <- genes -RNA_input <- c(RNA_input, RNA_log2_FPKM) -``` + RNA_input <- c(RNA_input, RNA_log2_FPKM) The same procedure is repeated for the transcription factor activities. -``` r -## Transcription factors -{plot(density(signaling_input)) -abline(v = -0.5) -abline(v = 3.5)} -``` + ## Transcription factors + {plot(density(signaling_input)) + abline(v = -0.5) + abline(v = 3.5)} -![](MOFA_to_COSMOS_files/figure-gfm/Plot:%20TF%20activity%20estimates-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Plot%20TF%20activity%20estimates-1.png) The threshold is set to -0.5 and 3.5 respectively. Further, the -TF_to_remove variable is later used to remove transcription factors with -activities below this threshold from the prior knowledge network. +TF\_to\_remove variable is later used to remove transcription factors +with activities below this threshold from the prior knowledge network. -``` r -TF_to_remove <- signaling_input[signaling_input > -0.5 & signaling_input < 3.5] -signaling_input <- signaling_input[signaling_input < -0.5 | signaling_input > 3.5] -``` + TF_to_remove <- signaling_input[signaling_input > -0.5 & signaling_input < 3.5] + signaling_input <- signaling_input[signaling_input < -0.5 | signaling_input > 3.5] The same procedure is repeated for the ligand-receptor activities. -``` r -## Ligand-receptor interactions -{plot(density(ligrec_input)) -abline(v = -0.5) -abline(v = 2.5)} -``` + ## Ligand-receptor interactions + {plot(density(ligrec_input)) + abline(v = -0.5) + abline(v = 2.5)} -![](MOFA_to_COSMOS_files/figure-gfm/Plot:%20Ligand-receptor%20activity%20estimates-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Plot%20Ligand-receptor%20activity%20estimates-1.png) Here, only activities higher than 2.5 or lower than -0.5 are kept (see straight lines in plot). This is an arbitrary theshold aimed at keeping @@ -970,49 +870,43 @@ the names are adjusted accordingly and if there are multiple mentions of a receptor, its activity is calculated by the mean of the associated ligand-receptor interactions. -``` r -ligrec_input <- ligrec_input[ligrec_input > 2.5 | ligrec_input < -0.5] + ligrec_input <- ligrec_input[ligrec_input > 2.5 | ligrec_input < -0.5] -rec_inputs <- ligrec_input -names(rec_inputs) <- gsub(".+___","",names(rec_inputs)) -rec_inputs <- tapply(rec_inputs, names(rec_inputs), mean) -receptors <- names(rec_inputs) -rec_inputs <- as.numeric(rec_inputs) -names(rec_inputs) <- receptors -``` + rec_inputs <- ligrec_input + names(rec_inputs) <- gsub(".+___","",names(rec_inputs)) + rec_inputs <- tapply(rec_inputs, names(rec_inputs), mean) + receptors <- names(rec_inputs) + rec_inputs <- as.numeric(rec_inputs) + names(rec_inputs) <- receptors Finally, the same procedure is repeated for the metabolite factor weights. -``` r -## Metabolites -{plot(density(metab_inputs)) -abline(v = -0.2) -abline(v = 0.2)} -``` + ## Metabolites + {plot(density(metab_inputs)) + abline(v = -0.2) + abline(v = 0.2)} -![](MOFA_to_COSMOS_files/figure-gfm/Plot:%20Metabolite%20factor%20weights-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Plot%20Metabolite%20factor%20weights-1.png) The threshold is set to 0.2 and -0.2 respectively. Further, the -metab_to_exclude variable is later used to remove metabolites with +metab\_to\_exclude variable is later used to remove metabolites with activities below this threshold from the prior knowledge network. Moreover, metabolite names are translated into HMDB format and metabolic compartment codes (m = mitochondria, c = cytosol) as well as metab\_\_ prefix is added to HMDB IDs. More information regarding compartment codes can be found [here](http://bigg.ucsd.edu/compartments). -``` r -metab_to_HMDB <- as.data.frame( - read_csv("data/metabolomic/MetaboliteToHMDB.csv")) -metab_to_HMDB <- metab_to_HMDB[metab_to_HMDB$common %in% names(metab_inputs),] -metab_inputs <- metab_inputs[metab_to_HMDB$common] -names(metab_inputs) <- metab_to_HMDB$HMDB + metab_to_HMDB <- as.data.frame( + read_csv("data/metabolomic/MetaboliteToHMDB.csv")) + metab_to_HMDB <- metab_to_HMDB[metab_to_HMDB$common %in% names(metab_inputs),] + metab_inputs <- metab_inputs[metab_to_HMDB$common] + names(metab_inputs) <- metab_to_HMDB$HMDB -metab_inputs <- cosmosR::prepare_metab_inputs(metab_inputs, compartment_codes = c("m","c")) + metab_inputs <- cosmosR::prepare_metab_inputs(metab_inputs, compartment_codes = c("m","c")) -metab_to_exclude <- metab_inputs[abs(metab_inputs) < 0.2] -metab_inputs <- metab_inputs[abs(metab_inputs) > 0.2] -``` + metab_to_exclude <- metab_inputs[abs(metab_inputs) < 0.2] + metab_inputs <- metab_inputs[abs(metab_inputs) > 0.2] Next, the prior knowledge network (PKN) is loaded. To see how the full meta PKN was assembled, see @@ -1021,19 +915,17 @@ interested in including transcription factors and metabolites that were previously filtered out, we also remove the respective nodes from the PKN. -``` r -## Load and filter meta network -# data("meta_network") -# save(meta_network, file = "support/meta_network.RData") -load("support/meta_network.RData") + ## Load and filter meta network + # data("meta_network") + # save(meta_network, file = "support/meta_network.RData") + load("support/meta_network.RData") -meta_network <- meta_network_cleanup(meta_network) + meta_network <- meta_network_cleanup(meta_network) -meta_network <- meta_network[!(meta_network$source %in% names(TF_to_remove)) - & !(meta_network$target %in% names(TF_to_remove)),] -meta_network <- meta_network[!(meta_network$source %in% names(metab_to_exclude)) - & !(meta_network$target %in% names(metab_to_exclude)),] -``` + meta_network <- meta_network[!(meta_network$source %in% names(TF_to_remove)) + & !(meta_network$target %in% names(TF_to_remove)),] + meta_network <- meta_network[!(meta_network$source %in% names(metab_to_exclude)) + & !(meta_network$target %in% names(metab_to_exclude)),] Then the datasets are merged, entries that are not included in the PKN are removed and the filtering steps are completed. The merging in this @@ -1051,16 +943,14 @@ scale as the TF scores. Other scaling methods can be performed at the user’S discretion, this one is just a rough scaling, but good enough in this case. -``` r -# upstream_inputs <- c(rec_inputs) -# downstream_inputs <- c(metab_inputs*10, signaling_input) -# -# upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs, meta_network) -# downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs, meta_network) -# -# rec_to_TF_cosmos_input <- list(upstream_inputs_filtered, downstream_inputs_filtered) -# save(rec_to_TF_cosmos_input, file = "data/cosmos/rec_to_TF_cosmos_input.Rdata") -``` + # upstream_inputs <- c(rec_inputs) + # downstream_inputs <- c(metab_inputs*10, signaling_input) + # + # upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs, meta_network) + # downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs, meta_network) + # + # rec_to_TF_cosmos_input <- list(upstream_inputs_filtered, downstream_inputs_filtered) + # save(rec_to_TF_cosmos_input, file = "data/cosmos/rec_to_TF_cosmos_input.Rdata") ### MOON (meta footprint analysis) @@ -1074,16 +964,14 @@ In this approach, the network is compressed to avoid having redundant paths in the network leading to inflating the importance of some input measurements. -``` r -##filter expressed genes from PKN -# data("meta_network") -# save(meta_network, file = "support/meta_network.RData") -# load("support/meta_network.RData") + ##filter expressed genes from PKN + # data("meta_network") + # save(meta_network, file = "support/meta_network.RData") + # load("support/meta_network.RData") -# meta_network <- meta_network_cleanup(meta_network) + # meta_network <- meta_network_cleanup(meta_network) -expressed_genes <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv")) -``` + expressed_genes <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv")) ## Rows: 11265 Columns: 61 ## ── Column specification ──────────────────────────────────────────────────────── @@ -1094,22 +982,18 @@ expressed_genes <- as.data.frame(read_csv("data/RNA/RNA_log2_FPKM_clean.csv")) ## ℹ Use `spec()` to retrieve the full column specification for this data. ## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. -``` r -expressed_genes <- setNames(rep(1,length(expressed_genes[,1])), expressed_genes$Genes) -meta_network_filtered <- cosmosR:::filter_pkn_expressed_genes(names(expressed_genes), meta_pkn = meta_network) -``` + expressed_genes <- setNames(rep(1,length(expressed_genes[,1])), expressed_genes$Genes) + meta_network_filtered <- cosmosR:::filter_pkn_expressed_genes(names(expressed_genes), meta_pkn = meta_network) ## [1] "COSMOS: removing unexpressed nodes from PKN..." ## [1] "COSMOS: 14430 interactions removed" -``` r -##format metab inputs -metab_inputs <- as.numeric(scale(weights$metab[,4], center = F)) -names(metab_inputs) <- row.names(weights$metab) + ##format metab inputs + metab_inputs <- as.numeric(scale(weights$metab[,4], center = F)) + names(metab_inputs) <- row.names(weights$metab) -metab_to_HMDB <- as.data.frame( - read_csv("data/metabolomic/MetaboliteToHMDB.csv")) -``` + metab_to_HMDB <- as.data.frame( + read_csv("data/metabolomic/MetaboliteToHMDB.csv")) ## Rows: 139 Columns: 2 ## ── Column specification ──────────────────────────────────────────────────────── @@ -1119,108 +1003,90 @@ metab_to_HMDB <- as.data.frame( ## ℹ Use `spec()` to retrieve the full column specification for this data. ## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. -``` r -metab_to_HMDB <- metab_to_HMDB[metab_to_HMDB$common %in% names(metab_inputs),] -metab_inputs <- metab_inputs[metab_to_HMDB$common] -names(metab_inputs) <- metab_to_HMDB$HMDB + metab_to_HMDB <- metab_to_HMDB[metab_to_HMDB$common %in% names(metab_inputs),] + metab_inputs <- metab_inputs[metab_to_HMDB$common] + names(metab_inputs) <- metab_to_HMDB$HMDB -metab_inputs <- cosmosR::prepare_metab_inputs(metab_inputs, compartment_codes = c("m","c")) -``` + metab_inputs <- cosmosR::prepare_metab_inputs(metab_inputs, compartment_codes = c("m","c")) ## [1] "Adding compartment codes." -``` r -#prepare upstream inputs -upstream_inputs <- c(rec_inputs) #the upstream input should be filtered for most significant + #prepare upstream inputs + upstream_inputs <- c(rec_inputs) #the upstream input should be filtered for most significant -TF_inputs <- scale(ligrec_TF_moon_inputs$TF, center = F) -TF_inputs <- setNames(TF_inputs[,1], row.names(TF_inputs)) + TF_inputs <- scale(ligrec_TF_moon_inputs$TF, center = F) + TF_inputs <- setNames(TF_inputs[,1], row.names(TF_inputs)) -downstream_inputs <- c(metab_inputs, TF_inputs) #the downstream input should be complete + downstream_inputs <- c(metab_inputs, TF_inputs) #the downstream input should be complete -upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs, meta_network_filtered) -``` + upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs, meta_network_filtered) ## [1] "COSMOS: 5 input/measured nodes are not in PKN any more: BCAM, CD63, LRP10, NOTCH1, RXRA and 0 more." -``` r -downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs, meta_network_filtered) -``` + downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs, meta_network_filtered) ## [1] "COSMOS: 429 input/measured nodes are not in PKN any more: Metab__HMDB0011747_m, Metab__HMDB0000755_m, Metab__HMDB0001294_m, Metab__HMDB0001508_m, Metab__HMDB0000012_m, Metab__HMDB0001191_m and 423 more." -``` r -#Filter inputs and prune the meta_network to only keep nodes that can be found downstream of the inputs -#The number of step is quite flexible, 7 steps already covers most of the network + #Filter inputs and prune the meta_network to only keep nodes that can be found downstream of the inputs + #The number of step is quite flexible, 7 steps already covers most of the network -n_steps <- 6 + n_steps <- 6 -# in this step we prune the network to keep only the relevant part between upstream and downstream nodes -meta_network_filtered <- cosmosR:::keep_controllable_neighbours(meta_network_filtered - , n_steps, - names(upstream_inputs_filtered)) -``` + # in this step we prune the network to keep only the relevant part between upstream and downstream nodes + meta_network_filtered <- cosmosR:::keep_controllable_neighbours(meta_network_filtered + , n_steps, + names(upstream_inputs_filtered)) ## [1] "COSMOS: removing nodes that are not reachable from inputs within 6 steps" ## [1] "COSMOS: 29459 from 42695 interactions are removed from the PKN" -``` r -downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs_filtered, meta_network_filtered) -``` + downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs_filtered, meta_network_filtered) ## [1] "COSMOS: 28 input/measured nodes are not in PKN any more: Metab__HMDB0000168_m, Metab__HMDB0000056_m, Metab__HMDB0001051_m, Metab__HMDB0000139_m, Metab__HMDB0000687_m, Metab__HMDB0000696_m and 22 more." -``` r -meta_network_filtered <- cosmosR:::keep_observable_neighbours(meta_network_filtered, n_steps, names(downstream_inputs_filtered)) -``` + meta_network_filtered <- cosmosR:::keep_observable_neighbours(meta_network_filtered, n_steps, names(downstream_inputs_filtered)) ## [1] "COSMOS: removing nodes that are not observable by measurements within 6 steps" ## [1] "COSMOS: 6200 from 13236 interactions are removed from the PKN" -``` r -upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs_filtered, meta_network_filtered) -``` + upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs_filtered, meta_network_filtered) ## [1] "COSMOS: 4 input/measured nodes are not in PKN any more: EPHA6, GPC1, SDC1, SIRPA and 0 more." -``` r -write_csv(meta_network_filtered, file = "results/cosmos/moon/meta_network_filtered.csv") + write_csv(meta_network_filtered, file = "results/cosmos/moon/meta_network_filtered.csv") -#compress the network to avoid redundant master controllers -meta_network_compressed_list <- compress_same_children(meta_network_filtered, sig_input = upstream_inputs_filtered, metab_input = downstream_inputs_filtered) + #compress the network to avoid redundant master controllers + meta_network_compressed_list <- compress_same_children(meta_network_filtered, sig_input = upstream_inputs_filtered, metab_input = downstream_inputs_filtered) -meta_network_compressed <- meta_network_compressed_list$compressed_network + meta_network_compressed <- meta_network_compressed_list$compressed_network -meta_network_compressed <- meta_network_cleanup(meta_network_compressed) + meta_network_compressed <- meta_network_cleanup(meta_network_compressed) -load(file = "support/dorothea_df.RData") + load(file = "support/dorothea_df.RData") -# RNA_input <- as.numeric(weights$RNA[,4]) -# names(RNA_input) <- gsub("_RNA$","",row.names(weights$RNA)) -``` + # RNA_input <- as.numeric(weights$RNA[,4]) + # names(RNA_input) <- gsub("_RNA$","",row.names(weights$RNA)) In this step, we format the MOFA weight and activity score information so that it can be appended to the moon network result afterward. -``` r -data("HMDB_mapper_vec") + data("HMDB_mapper_vec") -## MOFA weights -MOFA_weights <- get_weights(model, factors = 4, as.data.frame = T) -MOFA_weights <- MOFA_weights %>% - arrange(feature, view) %>% - filter(!duplicated(feature)) -MOFA_weights <- MOFA_weights[,c(1,3)] -colnames(MOFA_weights) <- c("Nodes", "mofa_weights") + ## MOFA weights + MOFA_weights <- get_weights(model, factors = 4, as.data.frame = T) + MOFA_weights <- MOFA_weights %>% + arrange(feature, view) %>% + filter(!duplicated(feature)) + MOFA_weights <- MOFA_weights[,c(1,3)] + colnames(MOFA_weights) <- c("Nodes", "mofa_weights") -#remove RNA and proteo tags and integrate values -MOFA_weights$Nodes <- gsub("_RNA$","",MOFA_weights$Nodes) -MOFA_weights$Nodes <- gsub("_proteo$","",MOFA_weights$Nodes) -MOFA_weights$sign <- sign(MOFA_weights$mofa_weights) -MOFA_weights$mofa_weights <- abs(MOFA_weights$mofa_weights) + #remove RNA and proteo tags and integrate values + MOFA_weights$Nodes <- gsub("_RNA$","",MOFA_weights$Nodes) + MOFA_weights$Nodes <- gsub("_proteo$","",MOFA_weights$Nodes) + MOFA_weights$sign <- sign(MOFA_weights$mofa_weights) + MOFA_weights$mofa_weights <- abs(MOFA_weights$mofa_weights) -MOFA_weights <- MOFA_weights %>% group_by(Nodes) %>% summarise_each(funs(max(., na.rm = TRUE))) -``` + MOFA_weights <- MOFA_weights %>% group_by(Nodes) %>% summarise_each(funs(max(., na.rm = TRUE))) ## Warning: `summarise_each()` was deprecated in dplyr 0.7.0. ## ℹ Please use `across()` instead. @@ -1238,80 +1104,76 @@ MOFA_weights <- MOFA_weights %>% group_by(Nodes) %>% summarise_each(funs(max(., ## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. -``` r -MOFA_weights <- as.data.frame(MOFA_weights) -MOFA_weights$mofa_weights <- MOFA_weights$mofa_weights * MOFA_weights$sign -MOFA_weights <- MOFA_weights[,-3] - -#make metabolite names coherent between mofa and cosmos -#for now i just add the metab input that is already formatted -#I will miss some weights that were not input but will fix that at later time -MOFA_weights_metabaddon <- as.data.frame(metab_inputs) -names(MOFA_weights_metabaddon) <- "mofa_weights" -MOFA_weights_metabaddon$Nodes <- row.names(MOFA_weights_metabaddon) - -MOFA_weights_metabaddon[, 2] <- sapply(MOFA_weights_metabaddon[, 2], function(x, HMDB_mapper_vec) { - x <- gsub("Metab__", "", x) - suffixe <- stringr::str_extract(x, "_[a-z]$") - x <- gsub("_[a-z]$", "", x) - if (x %in% names(HMDB_mapper_vec)) { - x <- HMDB_mapper_vec[x] - x <- paste("Metab__", x, sep = "") - } - if (!is.na(suffixe)) { - x <- paste(x, suffixe, sep = "") - } - return(x) - }, HMDB_mapper_vec = HMDB_mapper_vec) - -MOFA_weights <- as.data.frame(rbind(MOFA_weights, MOFA_weights_metabaddon)) - - -## Ligands -lig_weights <- ligrec_TF_moon_inputs$ligrec -names(lig_weights) <- gsub("___.+","",names(lig_weights)) -lig_weights <- tapply(lig_weights, names(lig_weights), mean) -ligands <- names(lig_weights) -lig_weights <- as.numeric(lig_weights) -names(lig_weights) <- ligands -lig_weights <- data.frame(Nodes = names(lig_weights), feature_weights = lig_weights) - -## Receptors -rec_weights <- ligrec_TF_moon_inputs$ligrec -names(rec_weights) <- gsub(".+___","",names(rec_weights)) -rec_weights <- tapply(rec_weights, names(rec_weights), mean) -receptors <- names(rec_weights) -rec_weights <- as.numeric(rec_weights) -names(rec_weights) <- receptors -rec_weights <- data.frame(Nodes = names(rec_weights), feature_weights = rec_weights) - -## TF -TF_weights <- ligrec_TF_moon_inputs$TF -names(TF_weights) <- gsub("_TF","",names(TF_weights)) -TF_weights <- TF_weights[!(names(TF_weights) %in% c(rec_weights$Nodes,lig_weights$Nodes))] -TF_weights <- data.frame(Nodes = names(TF_weights), feature_weights = TF_weights) - - - -## Combine -feature_weights <- as.data.frame(rbind(TF_weights, rec_weights, lig_weights)) -``` + MOFA_weights <- as.data.frame(MOFA_weights) + MOFA_weights$mofa_weights <- MOFA_weights$mofa_weights * MOFA_weights$sign + MOFA_weights <- MOFA_weights[,-3] + + #make metabolite names coherent between mofa and cosmos + #for now i just add the metab input that is already formatted + #I will miss some weights that were not input but will fix that at later time + MOFA_weights_metabaddon <- as.data.frame(metab_inputs) + names(MOFA_weights_metabaddon) <- "mofa_weights" + MOFA_weights_metabaddon$Nodes <- row.names(MOFA_weights_metabaddon) + + MOFA_weights_metabaddon[, 2] <- sapply(MOFA_weights_metabaddon[, 2], function(x, HMDB_mapper_vec) { + x <- gsub("Metab__", "", x) + suffixe <- stringr::str_extract(x, "_[a-z]$") + x <- gsub("_[a-z]$", "", x) + if (x %in% names(HMDB_mapper_vec)) { + x <- HMDB_mapper_vec[x] + x <- paste("Metab__", x, sep = "") + } + if (!is.na(suffixe)) { + x <- paste(x, suffixe, sep = "") + } + return(x) + }, HMDB_mapper_vec = HMDB_mapper_vec) + + MOFA_weights <- as.data.frame(rbind(MOFA_weights, MOFA_weights_metabaddon)) + + + ## Ligands + lig_weights <- ligrec_TF_moon_inputs$ligrec + names(lig_weights) <- gsub("___.+","",names(lig_weights)) + lig_weights <- tapply(lig_weights, names(lig_weights), mean) + ligands <- names(lig_weights) + lig_weights <- as.numeric(lig_weights) + names(lig_weights) <- ligands + lig_weights <- data.frame(Nodes = names(lig_weights), feature_weights = lig_weights) + + ## Receptors + rec_weights <- ligrec_TF_moon_inputs$ligrec + names(rec_weights) <- gsub(".+___","",names(rec_weights)) + rec_weights <- tapply(rec_weights, names(rec_weights), mean) + receptors <- names(rec_weights) + rec_weights <- as.numeric(rec_weights) + names(rec_weights) <- receptors + rec_weights <- data.frame(Nodes = names(rec_weights), feature_weights = rec_weights) + + ## TF + TF_weights <- ligrec_TF_moon_inputs$TF + names(TF_weights) <- gsub("_TF","",names(TF_weights)) + TF_weights <- TF_weights[!(names(TF_weights) %in% c(rec_weights$Nodes,lig_weights$Nodes))] + TF_weights <- data.frame(Nodes = names(TF_weights), feature_weights = TF_weights) + + + + ## Combine + feature_weights <- as.data.frame(rbind(TF_weights, rec_weights, lig_weights)) To also identify whether a node is a ligand or receptor we also load the liana consensus LR ressource. -``` r -# LIANA -load("support/ligrec_ressource.RData") -ligrec_df <- ligrec_ressource[,c("source_genesymbol","target_genesymbol")] -ligrec_df <- distinct(ligrec_df) -names(ligrec_df) <- c("Node1","Node2") + # LIANA + load("support/ligrec_ressource.RData") + ligrec_df <- ligrec_ressource[,c("source_genesymbol","target_genesymbol")] + ligrec_df <- distinct(ligrec_df) + names(ligrec_df) <- c("Node1","Node2") -ligrec_df$Node1 <- gsub("-","_",ligrec_df$Node1) -ligrec_df$Node2 <- gsub("-","_",ligrec_df$Node2) -ligrec_df$Sign <- 1 -ligrec_df$Weight <- 1 -``` + ligrec_df$Node1 <- gsub("-","_",ligrec_df$Node1) + ligrec_df$Node2 <- gsub("-","_",ligrec_df$Node2) + ligrec_df$Sign <- 1 + ligrec_df$Weight <- 1 In this step. we run the first moon analysis, to connect receptors with downstream TFs and ligands. MOON function is run iterativelly in a loop @@ -1319,302 +1181,266 @@ until the resulting scored network does not include edges incoherent with the RNA and proteomic data measurents. then, the network is decompressed, IDs are translated in a human-friendly format and a subnetowrk comparable to the classic cosmos solution network is -generated with the reduce_solution_network function. - -``` r -meta_network_rec_to_TFmetab <- meta_network_compressed - -#We run moon in a loop until TF-target coherence convergences -before <- 1 -after <- 0 -i <- 1 -while (before != after & i < 10) { - before <- length(meta_network_rec_to_TFmetab[,1]) - start_time <- Sys.time() - moon_res <- cosmosR::moon(upstream_input = upstream_inputs_filtered, - downstream_input = downstream_inputs_filtered, - meta_network = meta_network_rec_to_TFmetab, - n_layers = n_steps, - statistic = "ulm") - - meta_network_rec_to_TFmetab <- filter_incohrent_TF_target(moon_res, dorothea_df, meta_network_rec_to_TFmetab, RNA_input) - end_time <- Sys.time() - - print(end_time - start_time) - - after <- length(meta_network_rec_to_TFmetab[,1]) - i <- i + 1 -} -``` +generated with the reduce\_solution\_network function. + + meta_network_rec_to_TFmetab <- meta_network_compressed + + #We run moon in a loop until TF-target coherence convergences + before <- 1 + after <- 0 + i <- 1 + while (before != after & i < 10) { + before <- length(meta_network_rec_to_TFmetab[,1]) + start_time <- Sys.time() + moon_res <- cosmosR::moon(upstream_input = upstream_inputs_filtered, + downstream_input = downstream_inputs_filtered, + meta_network = meta_network_rec_to_TFmetab, + n_layers = n_steps, + statistic = "ulm") + + meta_network_rec_to_TFmetab <- filter_incohrent_TF_target(moon_res, dorothea_df, meta_network_rec_to_TFmetab, RNA_input) + end_time <- Sys.time() + + print(end_time - start_time) + + after <- length(meta_network_rec_to_TFmetab[,1]) + i <- i + 1 + } ## [1] 2 ## [1] 3 ## [1] 4 ## [1] 5 ## [1] 6 - ## Time difference of 0.1666861 secs + ## Time difference of 0.1765909 secs ## [1] 2 ## [1] 3 ## [1] 4 ## [1] 5 ## [1] 6 - ## Time difference of 0.2143788 secs - -``` r -if(i < 10) -{ - print(paste("Converged after ",paste(i-1," iterations", sep = ""),sep = "")) -} else -{ - print(paste("Interupted after ",paste(i," iterations. Convergence uncertain.", sep = ""),sep = "")) -} -``` + ## Time difference of 0.2086551 secs + + if(i < 10) + { + print(paste("Converged after ",paste(i-1," iterations", sep = ""),sep = "")) + } else + { + print(paste("Interupted after ",paste(i," iterations. Convergence uncertain.", sep = ""),sep = "")) + } ## [1] "Converged after 2 iterations" -``` r -moon_res <- decompress_moon_result(moon_res, meta_network_compressed_list, meta_network_rec_to_TFmetab) + moon_res <- decompress_moon_result(moon_res, meta_network_compressed_list, meta_network_rec_to_TFmetab) -#save the whole res for later -moon_res_rec_to_TFmet <- moon_res[,c(4,2,3)] -moon_res_rec_to_TFmet[,1] <- translate_column_HMDB(moon_res_rec_to_TFmet[,1], HMDB_mapper_vec) + #save the whole res for later + moon_res_rec_to_TFmet <- moon_res[,c(4,2,3)] + moon_res_rec_to_TFmet[,1] <- translate_column_HMDB(moon_res_rec_to_TFmet[,1], HMDB_mapper_vec) -levels <- moon_res[,c(4,3)] + levels <- moon_res[,c(4,3)] -moon_res <- moon_res[,c(4,2)] -names(moon_res)[1] <- "source" + moon_res <- moon_res[,c(4,2)] + names(moon_res)[1] <- "source" -plot(density(moon_res$score)) -abline(v = 1) -abline(v = -1) -``` + plot(density(moon_res$score)) + abline(v = 1) + abline(v = -1) -![](MOFA_to_COSMOS_files/figure-gfm/Run%20moon%20rec_to_TFmetab-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/Run%20moon%20rec_to_TFmetab-1.png) -``` r -solution_network <- reduce_solution_network(decoupleRnival_res = moon_res, - meta_network = meta_network_filtered, - cutoff = 1.5, - upstream_input = upstream_inputs_filtered, - RNA_input = RNA_input, - n_steps = n_steps) -``` + solution_network <- reduce_solution_network(decoupleRnival_res = moon_res, + meta_network = meta_network_filtered, + cutoff = 1.5, + upstream_input = upstream_inputs_filtered, + RNA_input = RNA_input, + n_steps = n_steps) ## [1] "COSMOS: removing nodes that are not reachable from inputs within 6 steps" ## [1] "COSMOS: 385 from 901 interactions are removed from the PKN" -``` r -SIF_rec_to_TFmetab <- solution_network$SIF -names(SIF_rec_to_TFmetab)[3] <- "sign" -ATT_rec_to_TFmetab <- solution_network$ATT + SIF_rec_to_TFmetab <- solution_network$SIF + names(SIF_rec_to_TFmetab)[3] <- "sign" + ATT_rec_to_TFmetab <- solution_network$ATT -translated_res <- translate_res(SIF_rec_to_TFmetab,ATT_rec_to_TFmetab,HMDB_mapper_vec) + translated_res <- translate_res(SIF_rec_to_TFmetab,ATT_rec_to_TFmetab,HMDB_mapper_vec) -levels_translated <- translate_res(SIF_rec_to_TFmetab,levels,HMDB_mapper_vec)[[2]] + levels_translated <- translate_res(SIF_rec_to_TFmetab,levels,HMDB_mapper_vec)[[2]] -SIF_rec_to_TFmetab <- translated_res[[1]] -ATT_rec_to_TFmetab <- translated_res[[2]] + SIF_rec_to_TFmetab <- translated_res[[1]] + ATT_rec_to_TFmetab <- translated_res[[2]] -## Add weights to data -ATT_rec_to_TFmetab <- merge(ATT_rec_to_TFmetab, MOFA_weights, all.x = T) -ATT_rec_to_TFmetab <- merge(ATT_rec_to_TFmetab, feature_weights, all.x = T) -names(ATT_rec_to_TFmetab)[2] <- "AvgAct" + ## Add weights to data + ATT_rec_to_TFmetab <- merge(ATT_rec_to_TFmetab, MOFA_weights, all.x = T) + ATT_rec_to_TFmetab <- merge(ATT_rec_to_TFmetab, feature_weights, all.x = T) + names(ATT_rec_to_TFmetab)[2] <- "AvgAct" -ATT_rec_to_TFmetab$NodeType <- ifelse(ATT_rec_to_TFmetab$Nodes %in% levels_translated[levels_translated$level == 0,1],1,0) + ATT_rec_to_TFmetab$NodeType <- ifelse(ATT_rec_to_TFmetab$Nodes %in% levels_translated[levels_translated$level == 0,1],1,0) -ATT_rec_to_TFmetab$NodeType <- ifelse(ATT_rec_to_TFmetab$Nodes %in% ligrec_df$Node1, 2, ifelse(ATT_rec_to_TFmetab$Nodes %in% ligrec_df$Node2, 3, ifelse(ATT_rec_to_TFmetab$Nodes %in% dorothea_df$source, 4, ATT_rec_to_TFmetab$NodeType))) + ATT_rec_to_TFmetab$NodeType <- ifelse(ATT_rec_to_TFmetab$Nodes %in% ligrec_df$Node1, 2, ifelse(ATT_rec_to_TFmetab$Nodes %in% ligrec_df$Node2, 3, ifelse(ATT_rec_to_TFmetab$Nodes %in% dorothea_df$source, 4, ATT_rec_to_TFmetab$NodeType))) -names(SIF_rec_to_TFmetab)[4] <- "Weight" + names(SIF_rec_to_TFmetab)[4] <- "Weight" -write_csv(SIF_rec_to_TFmetab,file = "results/cosmos/moon/SIF_rec_TFmetab.csv") -write_csv(ATT_rec_to_TFmetab,file = "results/cosmos/moon/ATT_rec_TFmetab.csv") -``` + write_csv(SIF_rec_to_TFmetab,file = "results/cosmos/moon/SIF_rec_TFmetab.csv") + write_csv(ATT_rec_to_TFmetab,file = "results/cosmos/moon/ATT_rec_TFmetab.csv") Next, we also score a network connecting the TF with downstream ligands. The procedure is the same as the previous section, only the upstream and downstream input definition is changing. -``` r -##filter expressed genes from PKN -dorothea_PKN <- dorothea_df[,c(1,3,2)] -names(dorothea_PKN)[2] <- "interaction" + ##filter expressed genes from PKN + dorothea_PKN <- dorothea_df[,c(1,3,2)] + names(dorothea_PKN)[2] <- "interaction" -dorothea_PKN_filtered <- cosmosR:::filter_pkn_expressed_genes(names(expressed_genes), meta_pkn = dorothea_PKN) -``` + dorothea_PKN_filtered <- cosmosR:::filter_pkn_expressed_genes(names(expressed_genes), meta_pkn = dorothea_PKN) ## [1] "COSMOS: removing unexpressed nodes from PKN..." ## [1] "COSMOS: 12053 interactions removed" -``` r -#prepare upstream inputs + #prepare upstream inputs -#the upstream input should be filtered for most significant -upstream_inputs <- setNames(TF_weights$feature_weights, TF_weights$Nodes) -upstream_inputs <- upstream_inputs[abs(upstream_inputs) > 2] + #the upstream input should be filtered for most significant + upstream_inputs <- setNames(TF_weights$feature_weights, TF_weights$Nodes) + upstream_inputs <- upstream_inputs[abs(upstream_inputs) > 2] -lig_inputs <- ligrec_TF_moon_inputs$ligrec -names(lig_inputs) <- gsub("___.+","",names(lig_inputs)) -lig_inputs <- tapply(lig_inputs, names(lig_inputs), mean) -ligands <- names(lig_inputs) -lig_inputs <- as.numeric(lig_inputs) -names(lig_inputs) <- ligands + lig_inputs <- ligrec_TF_moon_inputs$ligrec + names(lig_inputs) <- gsub("___.+","",names(lig_inputs)) + lig_inputs <- tapply(lig_inputs, names(lig_inputs), mean) + ligands <- names(lig_inputs) + lig_inputs <- as.numeric(lig_inputs) + names(lig_inputs) <- ligands -lig_inputs <- scale(lig_inputs, center = F) -lig_inputs <- setNames(lig_inputs[,1], row.names(lig_inputs)) + lig_inputs <- scale(lig_inputs, center = F) + lig_inputs <- setNames(lig_inputs[,1], row.names(lig_inputs)) -downstream_inputs <- lig_inputs #the downstream input should be complete + downstream_inputs <- lig_inputs #the downstream input should be complete -upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs, dorothea_PKN_filtered) -``` + upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs, dorothea_PKN_filtered) ## [1] "COSMOS: 22 input/measured nodes are not in PKN any more: SPI1, NR0B2, ESR1, AR, NKX2-5, RARB and 16 more." -``` r -downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs, dorothea_PKN_filtered) -``` + downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs, dorothea_PKN_filtered) ## [1] "COSMOS: 19 input/measured nodes are not in PKN any more: ACTR2, ADAM9, AGRN, ARPC5, EFNA3, EFNB1 and 13 more." -``` r -#Filter inputs and prune the meta_network to only keep nodes that can be found downstream of the inputs -#The number of step is quite flexible, 7 steps already covers most of the network + #Filter inputs and prune the meta_network to only keep nodes that can be found downstream of the inputs + #The number of step is quite flexible, 7 steps already covers most of the network -n_steps <- 1 + n_steps <- 1 -# in this step we prune the network to keep only the relevant part between upstream and downstream nodes -dorothea_PKN_filtered <- cosmosR:::keep_controllable_neighbours(dorothea_PKN_filtered - , n_steps, - names(upstream_inputs_filtered)) -``` + # in this step we prune the network to keep only the relevant part between upstream and downstream nodes + dorothea_PKN_filtered <- cosmosR:::keep_controllable_neighbours(dorothea_PKN_filtered + , n_steps, + names(upstream_inputs_filtered)) ## [1] "COSMOS: removing nodes that are not reachable from inputs within 1 steps" ## [1] "COSMOS: 1896 from 17542 interactions are removed from the PKN" -``` r -downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs_filtered, dorothea_PKN_filtered) -``` + downstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(downstream_inputs_filtered, dorothea_PKN_filtered) ## [1] "COSMOS: 9 input/measured nodes are not in PKN any more: BMP1, ETV5, GAS6, GDF11, ITGB3BP, MAML2 and 3 more." -``` r -dorothea_PKN_filtered <- cosmosR:::keep_observable_neighbours(dorothea_PKN_filtered, n_steps, names(downstream_inputs_filtered)) -``` + dorothea_PKN_filtered <- cosmosR:::keep_observable_neighbours(dorothea_PKN_filtered, n_steps, names(downstream_inputs_filtered)) ## [1] "COSMOS: removing nodes that are not observable by measurements within 1 steps" ## [1] "COSMOS: 12699 from 15646 interactions are removed from the PKN" -``` r -upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs_filtered, dorothea_PKN_filtered) -``` + upstream_inputs_filtered <- cosmosR:::filter_input_nodes_not_in_pkn(upstream_inputs_filtered, dorothea_PKN_filtered) ## [1] "COSMOS: 7 input/measured nodes are not in PKN any more: MTF1, NR2C2, RBPJ, SRF, NCOA2, E2F2 and 1 more." -``` r -# load(file = "support/dorothea_df.RData") - -# RNA_input <- as.numeric(weights$RNA[,4]) -# names(RNA_input) <- gsub("_RNA$","",row.names(weights$RNA)) -``` - -``` r -meta_network_TF_lig <- dorothea_PKN_filtered - -write_csv(meta_network_TF_lig, file = "results/cosmos/moon/meta_network_TF_lig.csv") - -before <- 1 -after <- 0 -i <- 1 -while (before != after & i < 10) { - before <- length(meta_network_TF_lig[,1]) - moon_res <- cosmosR::moon(upstream_input = upstream_inputs_filtered, - downstream_input = downstream_inputs_filtered, - meta_network = meta_network_TF_lig, - n_layers = n_steps, - statistic = "ulm") - - meta_network_TF_lig <- filter_incohrent_TF_target(moon_res, dorothea_df, meta_network_TF_lig, RNA_input) - after <- length(meta_network_TF_lig[,1]) - i <- i + 1 -} - -if(i < 10) -{ - print(paste("Converged after ",paste(i-1," iterations", sep = ""),sep = "")) -} else -{ - print(paste("Interupted after ",paste(i," iterations. Convergence uncertain.", sep = ""),sep = "")) -} -``` + # load(file = "support/dorothea_df.RData") + + # RNA_input <- as.numeric(weights$RNA[,4]) + # names(RNA_input) <- gsub("_RNA$","",row.names(weights$RNA)) + + meta_network_TF_lig <- dorothea_PKN_filtered + + write_csv(meta_network_TF_lig, file = "results/cosmos/moon/meta_network_TF_lig.csv") + + before <- 1 + after <- 0 + i <- 1 + while (before != after & i < 10) { + before <- length(meta_network_TF_lig[,1]) + moon_res <- cosmosR::moon(upstream_input = upstream_inputs_filtered, + downstream_input = downstream_inputs_filtered, + meta_network = meta_network_TF_lig, + n_layers = n_steps, + statistic = "ulm") + + meta_network_TF_lig <- filter_incohrent_TF_target(moon_res, dorothea_df, meta_network_TF_lig, RNA_input) + after <- length(meta_network_TF_lig[,1]) + i <- i + 1 + } + + if(i < 10) + { + print(paste("Converged after ",paste(i-1," iterations", sep = ""),sep = "")) + } else + { + print(paste("Interupted after ",paste(i," iterations. Convergence uncertain.", sep = ""),sep = "")) + } ## [1] "Converged after 1 iterations" -``` r -moon_res_TF_lig <- moon_res -moon_res_TF_lig[,1] <- translate_column_HMDB(moon_res_TF_lig[,1], HMDB_mapper_vec) + moon_res_TF_lig <- moon_res + moon_res_TF_lig[,1] <- translate_column_HMDB(moon_res_TF_lig[,1], HMDB_mapper_vec) -levels <- moon_res[,c(1,3)] + levels <- moon_res[,c(1,3)] -moon_res <- moon_res[,c(1,2)] -names(moon_res)[1] <- "source" + moon_res <- moon_res[,c(1,2)] + names(moon_res)[1] <- "source" -plot(density(moon_res$score)) -abline(v = 1) -abline(v = -1) -``` + plot(density(moon_res$score)) + abline(v = 1) + abline(v = -1) -![](MOFA_to_COSMOS_files/figure-gfm/run%20moon%20TF%20to%20lig-1.png) +![](MOFA_to_COSMOS_files/figure-markdown_strict/run%20moon%20TF%20to%20lig-1.png) -``` r -solution_network <- reduce_solution_network(decoupleRnival_res = moon_res, - meta_network = as.data.frame(dorothea_PKN_filtered[,c(1,3,2)]), - cutoff = 1.5, - upstream_input = upstream_inputs_filtered, - RNA_input = RNA_input, - n_steps = n_steps) -``` + solution_network <- reduce_solution_network(decoupleRnival_res = moon_res, + meta_network = as.data.frame(dorothea_PKN_filtered[,c(1,3,2)]), + cutoff = 1.5, + upstream_input = upstream_inputs_filtered, + RNA_input = RNA_input, + n_steps = n_steps) ## [1] "COSMOS: removing nodes that are not reachable from inputs within 1 steps" ## [1] "COSMOS: 29 from 143 interactions are removed from the PKN" -``` r -SIF_TF_lig <- solution_network$SIF -names(SIF_TF_lig)[3] <- "sign" -ATT_TF_lig <- solution_network$ATT + SIF_TF_lig <- solution_network$SIF + names(SIF_TF_lig)[3] <- "sign" + ATT_TF_lig <- solution_network$ATT -translated_res <- translate_res(SIF_TF_lig,ATT_TF_lig,HMDB_mapper_vec) + translated_res <- translate_res(SIF_TF_lig,ATT_TF_lig,HMDB_mapper_vec) -levels_translated <- translate_res(SIF_TF_lig,levels,HMDB_mapper_vec)[[2]] + levels_translated <- translate_res(SIF_TF_lig,levels,HMDB_mapper_vec)[[2]] -SIF_TF_lig <- translated_res[[1]] -ATT_TF_lig <- translated_res[[2]] + SIF_TF_lig <- translated_res[[1]] + ATT_TF_lig <- translated_res[[2]] -## Add weights to data -ATT_TF_lig <- merge(ATT_TF_lig, MOFA_weights, all.x = T) -ATT_TF_lig <- merge(ATT_TF_lig, feature_weights, all.x = T) -names(ATT_TF_lig)[2] <- "AvgAct" + ## Add weights to data + ATT_TF_lig <- merge(ATT_TF_lig, MOFA_weights, all.x = T) + ATT_TF_lig <- merge(ATT_TF_lig, feature_weights, all.x = T) + names(ATT_TF_lig)[2] <- "AvgAct" -ATT_TF_lig$NodeType <- ifelse(ATT_TF_lig$Nodes %in% levels_translated[levels_translated$level == 0,1],1,0) + ATT_TF_lig$NodeType <- ifelse(ATT_TF_lig$Nodes %in% levels_translated[levels_translated$level == 0,1],1,0) -ATT_TF_lig$NodeType <- ifelse(ATT_TF_lig$Nodes %in% ligrec_df$Node1, 2, ifelse(ATT_TF_lig$Nodes %in% ligrec_df$Node2, 3, ifelse(ATT_TF_lig$Nodes %in% dorothea_df$source, 4, ATT_TF_lig$NodeType))) + ATT_TF_lig$NodeType <- ifelse(ATT_TF_lig$Nodes %in% ligrec_df$Node1, 2, ifelse(ATT_TF_lig$Nodes %in% ligrec_df$Node2, 3, ifelse(ATT_TF_lig$Nodes %in% dorothea_df$source, 4, ATT_TF_lig$NodeType))) -names(SIF_TF_lig)[4] <- "Weight" + names(SIF_TF_lig)[4] <- "Weight" -write_csv(SIF_TF_lig,file = "results/cosmos/moon/SIF_TF_lig.csv") -write_csv(ATT_TF_lig,file = "results/cosmos/moon/ATT_TF_lig.csv") -``` + write_csv(SIF_TF_lig,file = "results/cosmos/moon/SIF_TF_lig.csv") + write_csv(ATT_TF_lig,file = "results/cosmos/moon/ATT_TF_lig.csv") -Both networks (Receptos -\> TF and metabs; TF -\> Ligands) are merged -together into a single compbined scored network. We also add the +Both networks (Receptos -> TF and metabs; TF -> Ligands) are +merged together into a single compbined scored network. We also add the conections between ligands and their corresponding receptors. -``` r -combined_SIF_moon <- as.data.frame(rbind(SIF_rec_to_TFmetab, SIF_TF_lig)) -combined_SIF_moon <- unique(combined_SIF_moon) + combined_SIF_moon <- as.data.frame(rbind(SIF_rec_to_TFmetab, SIF_TF_lig)) + combined_SIF_moon <- unique(combined_SIF_moon) -combined_ATT_moon <- as.data.frame(rbind(ATT_rec_to_TFmetab, ATT_TF_lig)) -combined_ATT_moon <- combined_ATT_moon %>% group_by(Nodes) %>% summarise_each(funs(mean(., na.rm = TRUE))) -``` + combined_ATT_moon <- as.data.frame(rbind(ATT_rec_to_TFmetab, ATT_TF_lig)) + combined_ATT_moon <- combined_ATT_moon %>% group_by(Nodes) %>% summarise_each(funs(mean(., na.rm = TRUE))) ## Warning: `summarise_each()` was deprecated in dplyr 0.7.0. ## ℹ Please use `across()` instead. @@ -1632,40 +1458,37 @@ combined_ATT_moon <- combined_ATT_moon %>% group_by(Nodes) %>% summarise_each(fu ## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. -``` r -combined_ATT_moon <- as.data.frame(combined_ATT_moon) + combined_ATT_moon <- as.data.frame(combined_ATT_moon) -ligrec_ressource_addon <- ligrec_ressource[ - ligrec_ressource$source_genesymbol %in% combined_SIF_moon$target & - ligrec_ressource$target_genesymbol %in% combined_SIF_moon$source -, c(10,12)] -ligrec_ressource_addon$sign <- 1 -ligrec_ressource_addon$Weight <- 1 -names(ligrec_ressource_addon)[c(1,2)] <- c("source","target") -ligrec_ressource_addon <- unique(ligrec_ressource_addon) + ligrec_ressource_addon <- ligrec_ressource[ + ligrec_ressource$source_genesymbol %in% combined_SIF_moon$target & + ligrec_ressource$target_genesymbol %in% combined_SIF_moon$source + , c(10,12)] + ligrec_ressource_addon$sign <- 1 + ligrec_ressource_addon$Weight <- 1 + names(ligrec_ressource_addon)[c(1,2)] <- c("source","target") + ligrec_ressource_addon <- unique(ligrec_ressource_addon) -combined_SIF_moon <- as.data.frame(rbind(combined_SIF_moon, ligrec_ressource_addon)) + combined_SIF_moon <- as.data.frame(rbind(combined_SIF_moon, ligrec_ressource_addon)) -#It appears I may need to only consider direct TF regulations for TF to lig + #It appears I may need to only consider direct TF regulations for TF to lig -write_csv(combined_SIF_moon,file = "results/cosmos/moon/combined_SIF_moon.csv") -write_csv(combined_ATT_moon,file = "results/cosmos/moon/combined_ATT_moon.csv") + write_csv(combined_SIF_moon,file = "results/cosmos/moon/combined_SIF_moon.csv") + write_csv(combined_ATT_moon,file = "results/cosmos/moon/combined_ATT_moon.csv") -write_csv(moon_res_rec_to_TFmet, file = "results/cosmos/moon/moon_res_rec_to_TFmet.csv") -write_csv(moon_res_TF_lig, file = "results/cosmos/moon/moon_res_TF_lig.csv") -``` + write_csv(moon_res_rec_to_TFmet, file = "results/cosmos/moon/moon_res_rec_to_TFmet.csv") + write_csv(moon_res_TF_lig, file = "results/cosmos/moon/moon_res_TF_lig.csv") ### Further analyses Further MOFA-COSMOS downstream analyses focusing on analyzing network control structures and cell line specific MOFA transcriptomics are given -in ./pathway_control_analysis_moon.Rmd and ./Cell_line_MOFA_space.Rmd +in ./pathway\_control\_analysis\_moon.Rmd and +./Cell\_line\_MOFA\_space.Rmd ### Session info -``` r -sessionInfo() -``` + sessionInfo() ## R version 4.2.0 (2022-04-22) ## Platform: aarch64-apple-darwin20 (64-bit) diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations-1.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations-1.png deleted file mode 100644 index a85cff1..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations-2.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations-2.png deleted file mode 100644 index 4c73447..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations-2.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Correlation between factors-1.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Correlation between factors-1.png deleted file mode 100644 index 8f4ac8f..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Correlation between factors-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-1.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-1.png deleted file mode 100644 index f0e4b42..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-2.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-2.png deleted file mode 100644 index df96d53..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-2.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-3.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-3.png deleted file mode 100644 index 4b95cb3..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor value per sample and factor-3.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Heatmap RNA view factor 2-1.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Heatmap RNA view factor 2-1.png deleted file mode 100644 index 3ba180e..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Heatmap RNA view factor 2-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Heatmap RNA view factor 4-1.png b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Heatmap RNA view factor 4-1.png deleted file mode 100644 index 3ba180e..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Heatmap RNA view factor 4-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot b/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot deleted file mode 100644 index e69de29..0000000 diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Activity estimations-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Activity estimations-1.png deleted file mode 100644 index a85cff1..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Activity estimations-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Activity estimations-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Activity estimations-2.png deleted file mode 100644 index 4c73447..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Activity estimations-2.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Compare MOFA models-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Compare MOFA models-1.png deleted file mode 100644 index 37dbcfa..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Compare MOFA models-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Correlation between factors-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Correlation between factors-1.png deleted file mode 100644 index 8f4ac8f..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Correlation between factors-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-1.png deleted file mode 100644 index aa65a8d..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-2.png deleted file mode 100644 index bfff1a2..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-2.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-3.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-3.png deleted file mode 100644 index f8b9a33..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor value per sample and factor-3.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor weights per view-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor weights per view-1.png deleted file mode 100644 index 55642b9..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Factor weights per view-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Heatmap RNA view factor 2-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Heatmap RNA view factor 2-1.png deleted file mode 100644 index 3ba180e..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Heatmap RNA view factor 2-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Plot b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Plot deleted file mode 100644 index e69de29..0000000 diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-1.png deleted file mode 100644 index 8608259..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-2.png deleted file mode 100644 index e5bbbad..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-2.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-3.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-3.png deleted file mode 100644 index 597c396..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-3.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-4.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-4.png deleted file mode 100644 index e6a73f1..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-4.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-5.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-5.png deleted file mode 100644 index 7fe15b1..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Preparing MOFA input-5.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Total variance per factor-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Total variance per factor-1.png deleted file mode 100644 index 6f091ed..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Total variance per factor-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Variance per view per factor-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Variance per view per factor-1.png deleted file mode 100644 index b35b6b7..0000000 Binary files a/vignettes/MOFA_to_COSMOS_files/figure-markdown_github/Variance per view per factor-1.png and /dev/null differ diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations for factor 4-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Activity estimations for factor 4-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations for factor 4-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Activity estimations for factor 4-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations for factor 4-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Activity estimations for factor 4-2.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Activity estimations for factor 4-2.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Activity estimations for factor 4-2.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Chekc cross correlation of omics compared to variance epxlained-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Chekc cross correlation of omics compared to variance epxlained-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Chekc cross correlation of omics compared to variance epxlained-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Chekc cross correlation of omics compared to variance epxlained-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Chekc cross correlation of omics compared to variance epxlained-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Chekc cross correlation of omics compared to variance epxlained-2.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Chekc cross correlation of omics compared to variance epxlained-2.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Chekc cross correlation of omics compared to variance epxlained-2.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Chekc cross correlation of omics compared to variance epxlained-3.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Chekc cross correlation of omics compared to variance epxlained-3.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Chekc cross correlation of omics compared to variance epxlained-3.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Chekc cross correlation of omics compared to variance epxlained-3.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Compare MOFA models-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Compare MOFA models-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Compare MOFA models-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Compare MOFA models-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor weights per view-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Factor weights per view-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Factor weights per view-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Factor weights per view-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: Ligand-receptor activity estimates-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot Ligand-receptor activity estimates-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: Ligand-receptor activity estimates-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot Ligand-receptor activity estimates-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: Metabolite factor weights-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot Metabolite factor weights-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: Metabolite factor weights-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot Metabolite factor weights-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: RNA weights & protein weights-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot RNA weights and protein weights-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: RNA weights & protein weights-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot RNA weights and protein weights-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: RNA weights & protein weights-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot RNA weights and protein weights-2.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: RNA weights & protein weights-2.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot RNA weights and protein weights-2.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: TF activity estimates-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot TF activity estimates-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Plot: TF activity estimates-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Plot TF activity estimates-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-2.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-2.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-2.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-2.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-3.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-3.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-3.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-3.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-4.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-4.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-4.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-4.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-5.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-5.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Preparing MOFA input-5.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Preparing MOFA input-5.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Run moon rec_to_TFmetab-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Run moon rec_to_TFmetab-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Run moon rec_to_TFmetab-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Run moon rec_to_TFmetab-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Supp figure condition_prot_RNA_correlation-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Supp figure condition_prot_RNA_correlation-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Supp figure condition_prot_RNA_correlation-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Supp figure condition_prot_RNA_correlation-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Supp figure single_prot_RNA_correlation-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Supp figure single_prot_RNA_correlation-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Supp figure single_prot_RNA_correlation-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Supp figure single_prot_RNA_correlation-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Total variance per factor-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Total variance per factor-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Total variance per factor-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Total variance per factor-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/Variance per view per factor-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Variance per view per factor-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/Variance per view per factor-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/Variance per view per factor-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/correlation RNA/prot-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/correlation RNA/prot-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/correlation RNA/prot-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/correlation RNA/prot-1.png diff --git a/vignettes/MOFA_to_COSMOS_files/figure-gfm/run moon TF to lig-1.png b/vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/run moon TF to lig-1.png similarity index 100% rename from vignettes/MOFA_to_COSMOS_files/figure-gfm/run moon TF to lig-1.png rename to vignettes/MOFA_to_COSMOS_files/figure-markdown_strict/run moon TF to lig-1.png