Adjusting for false discoveries in constraint- and sampling-based differential metabolic flux analysis
Submitted to: Journal of Biomedical Informatics
- Bruno G. Galuzzi (a, c)
- Luca Milazzo (b)
- Chiara Damiani (a, c)
(a) Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza dell’Ateneo Nuovo, 1, Milan, 20125, Italy
(b) Department of Informatics, Systems, and Communications, University of Milano-Bicocca, Piazza dell’Ateneo Nuovo, 1, Milan, 20125, Italy
(c) SYSBIO Centre of Systems Biology/ ISBE.IT, Milan, Milan,
- Different samples of the very same feasible region of a metabolic network can produce different marginal flux distributions, with the risk of false discoveries.
- For Hit-and-Run strategies, the thinning value has a higher impact on false discoveries than the sample size.
- A hypothesis test on the KL-divergence fully corrects for false discoveries.
- Sampling the corners of a feasible region with random functions is less prone to false discoveries and produces marginal flux distributions different from the ones of Hit-and-Run strategies.
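The points above on thinning and KL-divergence can be illustrated with a minimal sketch. It is not the code of this repository: `toy_chain` is a hypothetical AR(1) chain used as a stand-in for a Hit-and-Run sampler, and the histogram estimator with its bin count and `eps` pseudocount is an assumption made for illustration. The sketch only computes the KL-divergence between two samples of the same distribution; it does not implement the hypothesis test.

```python
import math
import random


def histogram_probs(sample, lo, hi, bins, eps):
    """Smoothed bin probabilities of a 1-D sample on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in sample:
        # Clamp so that x == hi falls in the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(sample) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(sample_p, sample_q, bins=50, eps=1e-10):
    """Histogram estimate of KL(P || Q) computed on shared bin edges."""
    lo = min(min(sample_p), min(sample_q))
    hi = max(max(sample_p), max(sample_q))
    if lo == hi:  # both samples are the same constant
        return 0.0
    p = histogram_probs(sample_p, lo, hi, bins, eps)
    q = histogram_probs(sample_q, lo, hi, bins, eps)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def toy_chain(n_samples, thinning, rho=0.99, seed=0):
    """Hypothetical AR(1) chain: keep one state every `thinning` steps."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(1.0 - rho ** 2)  # keeps the stationary variance at 1
    x = rng.gauss(0.0, 1.0)
    kept = []
    for step in range(n_samples * thinning):
        x = rho * x + rng.gauss(0.0, noise_sd)
        if (step + 1) % thinning == 0:
            kept.append(x)
    return kept


# Two independent runs targeting the same distribution, same sample size,
# once without thinning and once keeping one state every 100 steps.
kld_no_thinning = kl_divergence(toy_chain(2000, 1, seed=1),
                                toy_chain(2000, 1, seed=2))
kld_thinning = kl_divergence(toy_chain(2000, 100, seed=1),
                             toy_chain(2000, 100, seed=2))
print("KLD without thinning:", kld_no_thinning)
print("KLD with thinning 100:", kld_thinning)
```

Both pairs of runs sample the same target, so any divergence between them is an artefact of the sampling. In this toy setting the strongly autocorrelated, unthinned chains are expected to disagree more than the thinned ones, even though the sample size is identical.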
- JupyterLab - version 3.2.1
- Python - version 3.9.7
- Matlab - version 9.8.0.1396136
- R - version 4.2.2
- CobraToolBox - https://opencobra.github.io/cobratoolbox/stable/installation.html
This repository is organized so that users can easily reproduce the entire analysis pipeline. In particular, the code handles the folder structure and files automatically by using relative paths. To run all the analyses correctly, follow the instructions reported at the beginning of each file in the "code" folder and execute the files in the following order:
- sampling.ipynb
- sampling_CHRR.mat
- KLD.R
- KLD.ipynb
- convergence.R
- convergence.ipynb
- FDR.ipynb
- CHRR_CBS3.ipynb
- visualization.ipynb
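Before starting, it can help to check that every file of the execution order above is present. The following is a minimal sketch, not part of the repository: the file names and the "code" folder are taken from this README, while the `plan` helper and the mapping from file extension to tool are assumptions made for illustration. It should be run from the repository root so that the relative path resolves.

```python
from pathlib import Path

# File names copied from the execution order listed above.
EXECUTION_ORDER = [
    "sampling.ipynb",
    "sampling_CHRR.mat",
    "KLD.R",
    "KLD.ipynb",
    "convergence.R",
    "convergence.ipynb",
    "FDR.ipynb",
    "CHRR_CBS3.ipynb",
    "visualization.ipynb",
]

# Assumed mapping from file extension to the tool in the dependency list.
TOOL_BY_SUFFIX = {".ipynb": "JupyterLab", ".R": "R", ".mat": "Matlab"}


def plan(code_dir="code"):
    """Return (step number, relative path, tool, exists) for each file, in order."""
    code_dir = Path(code_dir)
    steps = []
    for number, name in enumerate(EXECUTION_ORDER, start=1):
        path = code_dir / name
        steps.append((number, path, TOOL_BY_SUFFIX[path.suffix], path.is_file()))
    return steps


for number, path, tool, exists in plan():
    status = "found" if exists else "missing"
    print(f"{number}. {path} [{tool}] {status}")
```

The script only reports which files are found; it does not run them, since each file carries its own instructions at the top.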