The exposome is the totality of exposures that we encounter over a lifetime. Chemicals are its main component, and we can be exposed to thousands of them in daily life. Some, such as hormones, are generated by our own bodies; others, such as phthalates or PFAS, are introduced by the consumer industry. To monitor these exposures, biological samples such as blood, urine, stool, hair, nails, and saliva can be analyzed with advanced instruments such as mass spectrometers, which can measure thousands of chemicals in a single snapshot. These instruments can generate mountains of data within hours from a single person's samples. A key challenge is to interpret these data in the context of human diseases and to prioritize novel chemical exposures.
The IDSL.ME team develops software, databases, and novel approaches to analyze and interpret metabolomics and exposomics datasets in population-scale studies. More details about the group's research and resources can be found at IDSL.ME. The group is part of the Department of Environmental Medicine and Public Health (EMPH) and the Institute of Exposomics Research at the Icahn School of Medicine at Mount Sinai (ISMMS), New York, USA.
Principal Investigator: Dinesh Barupal
- IDSL.MXP : A lightweight parser for mzML, netCDF and mzXML files
- IDSL.IPA : To generate comprehensive data matrices from an untargeted LC/GC-HRMS dataset
- IDSL.UFA : To annotate MS1-level data with molecular formulas using isotope profile similarity
- IDSL.CSA : To annotate peaks using composite spectra created from MS1-only data
- IDSL.FSA : To annotate peaks using fragmentation data generated using DIA and DDA methods
- IDSL_MINT : A Python workflow for training transformer models to predict molecular fingerprints from MS/MS spectra
- IDSL.GOA : To query the Gene Ontology database for multi-omics data interpretation
- ChemRICH : Metabolite set analysis that is independent of a background database
- MetaMapp : Metabolic network mapping using atomic mapping of reactions and chemical similarity
- ECID : Exposome Correlation and Interpretation Database (NIEHS U24ES035386 Biomedical Knowledgebase)
- CCDB : A database of inter-chemical correlations
- Blood Exposome DB : A text-mining-driven catalogue of chemicals found in blood samples
- Cancer Hazard Prioritization : To prioritize cancer hazards for the IARC Monographs programme
- PubMed-FT : NLP-guided queries of PubMed abstracts
- PMC-FT : NLP-guided queries of full-text data in the PMC database
- IDSL_MINT: a deep learning framework to predict molecular fingerprints from mass spectra. Baygi SF, Barupal DK. J Cheminform. 2024 Jan 18;16(1):8. doi: 10.1186/s13321-024-00804-5. PMID: 38238779
- IDSL.GOA: gene ontology analysis for interpreting metabolomic datasets. Mahajan P, Fiehn O, Barupal D. Sci Rep. 2024 Jan 14;14(1):1299. doi: 10.1038/s41598-024-51992-x.
- IDSL.CSA: Composite Spectra Analysis for Chemical Annotation of Untargeted Metabolomics Datasets. Baygi SF, Kumar Y, Barupal DK. Anal Chem. 2023 Jun 27;95(25):9480-9487. doi: 10.1021/acs.analchem.3c00376. Epub 2023 Jun 13.
- IDSL.UFA Assigns High-Confidence Molecular Formula Annotations for Untargeted LC/HRMS Data Sets in Metabolomics and Exposomics. Baygi SF, Banerjee SK, Chakraborty P, Kumar Y, Barupal DK. Anal Chem. 2022 Oct 4;94(39):13315-13322. doi: 10.1021/acs.analchem.2c00563. Epub 2022 Sep 22.
- IDSL.IPA Characterizes the Organic Chemical Space in Untargeted LC/HRMS Data Sets. Fakouri Baygi S, Kumar Y, Barupal DK. J Proteome Res. 2022 Jun 3;21(6):1485-1494. doi: 10.1021/acs.jproteome.2c00120.
- CCDB: A database for exploring inter-chemical correlations in metabolomics and exposomics datasets. Barupal DK, Mahajan P, Fakouri-Baygi S, Wright RO, Arora M, Teitelbaum SL. Environ Int. 2022 Jun;164:107240. doi: 10.1016/j.envint.2022.107240. Epub 2022 Apr 18.
- Prioritizing cancer hazard assessments for IARC Monographs using an integrated approach of database fusion and text mining. Barupal DK, Schubauer-Berigan MK, Korenjak M, Zavadil J, Guyton KZ. Environ Int. 2021 Nov;156:106624. doi: 10.1016/j.envint.2021.106624. Epub 2021 May 10.
- Generating the Blood Exposome Database Using a Comprehensive Text Mining and Database Fusion Approach. Barupal DK, Fiehn O. Environ Health Perspect. 2019 Sep;127(9):97008. doi: 10.1289/EHP4713. Epub 2019 Sep 26.
- Chemical Similarity Enrichment Analysis (ChemRICH) as alternative to biochemical pathway mapping for metabolomic datasets. Barupal DK, Fiehn O. Sci Rep. 2017 Nov 6;7(1):14567. doi: 10.1038/s41598-017-15231-w.
- MetaMapp: mapping and visualizing metabolomic data by integrating information from biochemical pathways and chemical and mass spectral similarity. Barupal DK, Haldiya PK, Wohlgemuth G, Kind T, Kothari SL, Pinkerton KE, Fiehn O. BMC Bioinformatics. 2012 May 16;13:99. doi: 10.1186/1471-2105-13-99.
- NIEHS (U24ES035386) Exposome Correlation and Interpretation Database (ECID) [2023-2028]. PIs: Dinesh Barupal and Susan Teitelbaum.
- NIEHS (R01ES035478) Metal Mixtures, MicroRNAs and Metabolomics in Extracellular Vesicles, and Early-life Programming of Childhood Sleep Patterns: A Longitudinal Study [2024-2029]. PIs: Allison Kupsco and Dinesh Barupal.
- IDSL.ME is contributing to several other NIH-funded projects (P30ES023515, U2CES026561, U2CES026555, U2CES030859, R01ES033688, UL1TR004419, R35ES030435, R01ES032831, UH3OD023337).
- Most of our software is written in the R and Python programming languages. For online tools, we use the ReactJS framework. To contribute to the IDSL.ME codebase, submit your request to [email protected] . Significant contributions will be credited with authorship in future manuscripts.
- We are always looking for bioinformatics programmers, postdoctoral fellows in metabolomics, exposomics, or toxicology, data curators (omics, literature, biomonitoring), and data science analysts. Reach out to [email protected] with your CV.