multiOmics data analysis, integration, and visualisation protocol

Data integration and modelling with R
Systems biology analysis and visualization pipeline in R

Data management framework

pISA-tree on GitHub
- Petek, M., Zagorščak, M., Blejec, A. et al. pISA-tree - a data management framework for life science research projects using a standardised directory tree. Sci Data 9, 685 (2022). https://doi.org/10.1038/s41597-022-01805-5

Expected measurements

one or multiple genotypes
under single and multiple abiotic/biotic stressors
experiment duration: XY hours, days, ... (time-series experimental design)
tissue: single or multiple
'Omics strategies:
- Hormonomics
- Transcriptomics
- Proteomics
- Metabolomics
- Phenomics

Analysis steps:

1. Design the Phenodata file, a master experimental design table describing the samples for analysis, before sample collection, in line with good data management practice
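
As an illustration of such a design table, the following is a minimal sketch in base R. The column names (`SampleName`, `Genotype`, `Treatment`, `SamplingTime`, `Replicate`), the factor levels, and the output file name are hypothetical and are not prescribed by this protocol or by pISA-tree.

```r
# Hypothetical factor levels; replace them with the real experimental design.
design <- expand.grid(
  Replicate    = 1:3,
  SamplingTime = c(1, 3, 7),                       # e.g. days after treatment
  Treatment    = c("control", "heat", "drought"),
  Genotype     = c("G1", "G2"),
  stringsAsFactors = FALSE
)

# One row per planned sample, each with a unique, descriptive identifier.
phenodata <- data.frame(
  SampleName = with(design, sprintf("%s_%s_T%02d_R%d",
                                    Genotype, Treatment, SamplingTime, Replicate)),
  design[, c("Genotype", "Treatment", "SamplingTime", "Replicate")],
  stringsAsFactors = FALSE
)

# Sanity checks that can be run before any sample is collected.
stopifnot(!anyDuplicated(phenodata$SampleName))
print(table(phenodata$Genotype, phenodata$Treatment))  # design balance overview

# Tab-delimited text stays readable for both people and scripts.
out_file <- file.path(tempdir(), "phenodata.txt")
write.table(phenodata, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

Building the table from `expand.grid` makes missing factor combinations visible early, which is harder to notice once samples are already collected.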
2. Data preprocessing and overall inspection

detection of outliers and faulty measurements
data transformation (if needed)
- {multimode}
- {fitdistrplus}
- {caret}
- {glmnet}
- {MASS}
- {BIGL}
- {robustbase}
- {preprocessCore}
- {compositions}
- {mgcv}
interpolation
- point-to-point
  - approx, approxfun {stats} - approx returns a list of points which linearly interpolate given data points; approxfun returns a function performing the linear (or constant) interpolation.
- polynomial
  - predict {stats} - A generic function for predictions from the results of various model fitting functions
  - boxplot.stats {grDevices} - Box Plot Statistics
  - mad {BiocGenerics} - Compute the median absolute deviation for a vector
  - aq.plot {mvoutlier} - Adjusted quantile plots for multivariate outlier detection
extrapolation
imputation
- for qPCR see Baebler, Š., Svalina, M., Petek, M. et al. quantGenius: implementation of a decision support system for qPCR-based gene quantification. BMC Bioinformatics 18, 276 (2017). https://doi.org/10.1186/s12859-017-1688-7
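
The outlier detection and interpolation items above can be sketched with base R only. This is a minimal example on made-up values; the toy data, the 3.5 cut-off for the robust z-score, and the choice of a second-degree polynomial are assumptions made for illustration, not recommendations of the protocol.

```r
# Toy time series for one measured variable (hypothetical values).
time  <- c(0, 1, 2, 4, 8, 12, 24)
value <- c(1.0, 1.2, 1.1, 9.5, 1.6, 1.8, 2.1)   # 9.5 is a deliberately faulty value

# 1) Boxplot rule: values beyond 1.5 * IQR from the hinges.
box_out <- boxplot.stats(value)$out

# 2) Robust z-score based on the median absolute deviation (stats::mad).
robust_z <- abs(value - median(value)) / mad(value)
mad_out  <- value[robust_z > 3.5]               # 3.5 is an arbitrary cut-off

# Mask the flagged measurement before interpolating.
clean <- value
clean[value %in% box_out] <- NA
keep  <- !is.na(clean)

# Point-to-point (linear) interpolation: approxfun returns a function.
f_lin  <- approxfun(time[keep], clean[keep], method = "linear")
filled <- clean
filled[!keep] <- f_lin(time[!keep])

# Polynomial alternative: fit on the retained points, then use predict().
train    <- data.frame(t = time[keep], y = clean[keep])
fit_poly <- lm(y ~ poly(t, 2), data = train)
poly_at4 <- predict(fit_poly, newdata = data.frame(t = 4))
```

Both outlier rules flag the same value here; on real data they can disagree, so flagged points are better inspected than removed automatically.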

3. Statistical analysis of individual omics data layers
- ggplot {ggplot2} - various plots, https://r-graphics.org/chapter-ggplot2
- corr.test {psych} - Find the correlations, sample sizes, and probability values between elements of a matrix or data.frame
- cor.plot {psych} - Create an image plot for a correlation or factor matrix
- pairs.panels {psych} - SPLOM, histograms and correlations for a data matrix
- rcorr {Hmisc} - Matrix of Correlations and P-values
- heatmaply_cor {heatmaply} - Cluster heatmap based on plotly
- corrplot {corrplot} - A visualization of a correlation matrix
- pheatmap {pheatmap} - A function to draw clustered heatmaps
- t_test {rstatix}
- ggdotplot, ggviolin {ggpubr}
- metaMDS {vegan} - Nonmetric Multidimensional Scaling with Stable Solution from Random Starts, Axis Scaling and Species Scores
- {limma}, e.g. for non-targeted proteomics, RNA-seq, ...
  - limma::lmFit - Linear Model for Series of Arrays
  - limma::makeContrasts - Construct Matrix of Custom Contrasts
  - limma::contrasts.fit - Compute Contrasts from Linear Model Fit
  - limma::eBayes - Empirical Bayes Statistics for Differential Expression
  - limma::decideTests - Multiple Testing Across Genes and Contrasts
  - limma::topTable - Table of Top Genes from Linear Model Fit
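
As a dependency-free illustration of the correlation part of this step, the sketch below computes pairwise Pearson correlations with p-values using base `stats` functions (`cor.test`, `p.adjust`) as a stand-in for the packages listed above. The simulated data, the variable names `m1` to `m3`, and the choice of Benjamini-Hochberg adjustment are assumptions.

```r
set.seed(1)
n <- 20
x <- rnorm(n)

# Simulated measurements: m2 tracks m1, m3 is independent noise.
dat <- data.frame(
  m1 = x,
  m2 = x + rnorm(n, sd = 0.3),
  m3 = rnorm(n)
)

# All unordered pairs of variables.
pairs <- t(combn(names(dat), 2))
res <- data.frame(var1 = pairs[, 1], var2 = pairs[, 2],
                  r = NA_real_, p = NA_real_, stringsAsFactors = FALSE)

# Pearson correlation and its p-value for each pair.
for (i in seq_len(nrow(res))) {
  ct <- cor.test(dat[[res$var1[i]]], dat[[res$var2[i]]], method = "pearson")
  res$r[i] <- unname(ct$estimate)
  res$p[i] <- ct$p.value
}

# Correct for multiple testing across all pairs.
res$p_adj <- p.adjust(res$p, method = "BH")
print(res)
```

The long-format result (one row per pair) is convenient for filtering before plotting, whereas the matrix returned by `cor(dat)` is the usual input for heatmap-style displays.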
4. Correlation-based network inference within each omics level
- Leave-One-Out graphs
  - qgraph {qgraph}
  - igraph
- Lioness
Results from both methods depend heavily on the selected thresholds, and Lioness node and edge selection using FDR is even more sensitive to the correlation difference cut-off, so we suggest using an automated graph thresholding approach.
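
To make the two approaches concrete, here is a minimal base R sketch on simulated data. It assumes Pearson correlation as the network estimator, takes the leave-one-out network as the difference between the all-sample network and the network without sample q, and uses the published LIONESS linear interpolation, N * (all - without q) + without q. The helper name `sample_networks` is hypothetical, and no thresholding is applied, since the cut-off choice is exactly the sensitive part noted above.

```r
set.seed(42)
n_samples <- 12
expr <- matrix(rnorm(n_samples * 5), nrow = n_samples,
               dimnames = list(paste0("S", 1:n_samples), paste0("f", 1:5)))

# Hypothetical helper: sample-specific networks for sample q.
sample_networks <- function(x, q) {
  n       <- nrow(x)
  all_net <- cor(x)                         # network from all samples
  loo_net <- cor(x[-q, , drop = FALSE])     # network without sample q
  list(
    loo_diff = all_net - loo_net,                # leave-one-out difference
    lioness  = n * (all_net - loo_net) + loo_net # LIONESS estimate
  )
}

nets <- lapply(seq_len(n_samples), function(q) sample_networks(expr, q))
names(nets) <- rownames(expr)

# Edge list for one sample (upper triangle only, unthresholded).
m   <- nets[["S1"]]$lioness
idx <- which(upper.tri(m), arr.ind = TRUE)
edges_S1 <- data.frame(from   = rownames(m)[idx[, 1]],
                       to     = colnames(m)[idx[, 2]],
                       weight = m[idx],
                       stringsAsFactors = FALSE)
```

An edge list in this shape can be handed to `igraph::graph_from_data_frame`, and the weight matrix itself to `qgraph::qgraph`, once a threshold has been chosen.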
5. Integration across different omics datasets

Canonical Correlation Analysis
N-Integration Discriminant Analysis with DIABLO
- {mixOmics}
  - block.splsda {mixOmics} - N-integration and feature selection with Projection to Latent Structures models (PLS) with sparse Discriminant Analysis
  - plotDiablo {mixOmics} - Graphical output for the DIABLO framework
  - plotVar {mixOmics} - Plot of Variables
  - plotIndiv {mixOmics} - Plot of Individuals (Experimental Units)
  - plotArrow {mixOmics} - Arrow sample plot
  - circosPlot {mixOmics} - circosPlot for DIABLO
  - cimDiablo {mixOmics} - Clustered Image Maps (CIMs) ("heat maps") for DIABLO
  - network {mixOmics} - Relevance Network for (r)CCA and (s)PLS regression
Leave-One-Out graphs
- qgraph {qgraph}
- igraph
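
For the canonical correlation item of this step, the following is a minimal sketch using `stats::cancor` from base R rather than {mixOmics}. It is classical CCA, so it assumes more samples than variables in each block; high-dimensional omics data would need a regularised or sparse variant, and the supervised DIABLO analysis is not reproduced here. The two simulated blocks and their column names are hypothetical.

```r
set.seed(7)
n <- 30
latent <- rnorm(n)   # shared signal driving one variable in each block

# Two small simulated blocks measured on the same samples.
X <- cbind(t1 = latent + rnorm(n, sd = 0.5), t2 = rnorm(n), t3 = rnorm(n))
Y <- cbind(m1 = latent + rnorm(n, sd = 0.5), m2 = rnorm(n))

cca <- cancor(X, Y)   # centres both blocks by default
print(cca$cor)        # canonical correlations, largest first

# Sample scores on the first pair of canonical variates.
Xc <- sweep(X, 2, cca$xcenter)
Yc <- sweep(Y, 2, cca$ycenter)
u1 <- as.numeric(Xc %*% cca$xcoef[, 1])
v1 <- as.numeric(Yc %*% cca$ycoef[, 1])

# The correlation of the scores matches the first canonical correlation.
score_cor <- cor(u1, v1)
```

Plotting `u1` against `v1`, coloured by a Phenodata column, gives a quick view of whether the shared component separates the experimental groups.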

6. Integration of data with prior knowledge

Start of the analysis:

Data are expected to be arranged within the data management framework, with complete and descriptive metadata files, including the Phenodata file.
'Omics files are expected to be preprocessed (see suggestions in Step 2).
Minimal input files can be found in the './input' directory.
For Step 3 (statistical analysis of individual omics data layers), run script 01_Step3.Rmd.
For Step 4 (correlation-based network inference within/between each omics level), run script 02_Step4.Rmd.
For Step 5 (integration across different omics datasets), run script 03_Step5.Rnw.

For more info see multiOmics_data_analysis_Protocol
