Data processing and analysis of database search and quant proteomics results from GBM patients - Initial vs Recurrent conditions

MiguelCos/gbm_manuscript_data_analysis

Reproducible Data Analysis for the Manuscript 'Proteometabolomics of initial and recurrent glioblastoma highlights an increased immune cell signature with altered lipid metabolism'

Reproducible reports:

General Proteomics analysis

The general data analysis and processing of the search results after peptide/protein identification and quantitation can be accessed via the general proteomics reproducible report in this repo.

Large scale analyses of proteolytic processing

The reproducible report, containing code for data preprocessing, statistical analysis and intermediate plots for the large-scale differential analysis of proteolytic processing in recurrent glioblastoma, can be accessed via: large-scale proteolytic processing analysis reproducible report

Integrative lipidomics and proteomics

The reproducible report, containing code for data preprocessing, statistical analysis and intermediate plots for the integrative analysis of lipidomics and proteomics data in recurrent glioblastoma, can be accessed via: integrative lipidomics+proteomics reproducible report

Proteogenomics analyses

The reproducible report, containing code for data preprocessing, statistical analysis and intermediate plots for the proteogenomics analyses in recurrent glioblastoma, can be accessed via: proteogenomics reproducible report

To regenerate the analysis for Plasma ELISA Proteomics, knit the manuscript_elisa_figure_final.Rmd R notebook. The re-analysis of the Cox proportional hazards model including Age and Sex as covariates can be found here

Single-cell RNAseq data mining

We mined the single-cell RNAseq dataset on recurrent GBM (GSM4972210) to explore the expression of ASAH1 by annotated cell type. The reproducible report is available here

Specific R functions for data processing and analysis

Functions for peptide annotation and analysis of proteolytic processing

We have written a series of functions for the annotation of peptides according to their location in the protein sequence and their proteolytic specificity. These functions support the quantitative analysis of products of proteolytic processing, specifically in the context of TMT-based isobaric quantitation.

Used to map peptides to their corresponding protein in a FASTA file and annotate them in terms of their position within the protein sequence, the amino acids before and after, and their specificity type. This function is also used to evaluate whether an identified peptide contains a single amino acid variant, based on its called position.
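The mapping step described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the repository's R implementation. The function name `annotate_peptide`, the returned field names and the specificity labels are hypothetical, and the tryptic rule is simplified (cleavage after K or R, ignoring the proline exception).

```python
from typing import Optional


def annotate_peptide(peptide: str, protein: str) -> Optional[dict]:
    """Locate a peptide in a protein sequence and classify its cleavage specificity.

    Hypothetical helper: field names and labels are illustrative assumptions.
    """
    idx = protein.find(peptide)  # first occurrence only
    if idx == -1:
        return None  # peptide not found in this protein

    end = idx + len(peptide)  # 1-based position of the last residue
    prev_aa = protein[idx - 1] if idx > 0 else "-"
    next_aa = protein[end] if end < len(protein) else "-"

    # N-terminus counts as tryptic if preceded by K/R, or if it sits at the
    # protein start (also allowing removal of the initiator methionine).
    n_specific = prev_aa in ("K", "R") or idx == 0 or (idx == 1 and protein[0] == "M")
    # C-terminus counts as tryptic if the peptide ends in K/R or at the protein end.
    c_specific = peptide[-1] in ("K", "R") or next_aa == "-"

    if n_specific and c_specific:
        specificity = "specific"
    elif c_specific:
        specificity = "semi_Nterm"  # non-tryptic N-terminus
    elif n_specific:
        specificity = "semi_Cterm"  # non-tryptic C-terminus
    else:
        specificity = "unspecific"

    return {
        "start": idx + 1,  # 1-based start position
        "end": end,
        "prev_aa": prev_aa,
        "next_aa": next_aa,
        "specificity": specificity,
    }


protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # made-up example sequence
for pep in ("QISFVK", "ISFVK", "QISFV"):
    print(pep, annotate_peptide(pep, protein))
```

A real workflow would also need to handle peptides that occur more than once in a protein or in several proteins; this sketch only reports the first match.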

Evaluates if the peptide N-term contains a TMT-tag, acetylation or if it is free.
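As a rough illustration of this classification, the sketch below (Python, not the repository's R code) inspects a modification string and labels the peptide N-terminus. The input format `N-term(<mass>)`, the function name `classify_nterm` and the output labels are assumptions made for the example. The mass shifts used are the standard values for TMT 6/10/11-plex (229.1629 Da), TMTpro (304.2071 Da) and acetylation (42.0106 Da).

```python
import re

TMT_MASSES = (229.1629, 304.2071)  # TMT 6/10/11-plex and TMTpro mass shifts (Da)
ACETYL_MASS = 42.0106              # acetylation mass shift (Da)
TOLERANCE = 0.01                   # assumed matching tolerance (Da)


def classify_nterm(assigned_mods: str) -> str:
    """Classify the peptide N-terminus from a modification string.

    Assumes N-terminal modifications are written as 'N-term(<mass>)';
    this format is an assumption for the sketch.
    """
    match = re.search(r"N-term\((-?\d+(?:\.\d+)?)\)", assigned_mods or "")
    if match is None:
        return "free"  # no N-terminal modification reported
    mass = float(match.group(1))
    if any(abs(mass - m) <= TOLERANCE for m in TMT_MASSES):
        return "TMT-labelled"
    if abs(mass - ACETYL_MASS) <= TOLERANCE:
        return "acetylated"
    return "other"


for mods in ("N-term(229.1629), 7K(229.1629)", "N-term(42.0106)", "7K(229.1629)"):
    print(mods, "->", classify_nterm(mods))
```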

After identifying a set of interesting semi-specific peptides/proteolytic products, this function was used to generate peptide sequences that capture the residues in the vicinity (i.e. 10 amino acids before and after) of the non-tryptic cleavage site, in preparation for the analysis of sequence motifs.
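The window extraction can be sketched as follows, again in Python rather than the repository's R. The function name `cleavage_window`, the 1-based `cleavage_pos` convention (position of the residue immediately before the cleaved bond) and the use of `X` as padding near the protein termini are assumptions for illustration.

```python
def cleavage_window(protein: str, cleavage_pos: int, flank: int = 10, pad: str = "X") -> str:
    """Return `flank` residues on each side of a cleavage site.

    `cleavage_pos` is the 1-based position of the residue immediately
    N-terminal to the cleaved bond. Windows that run past either end of
    the protein are padded so every window has length 2 * flank.
    """
    if not 1 <= cleavage_pos < len(protein):
        raise ValueError("cleavage_pos must lie between 1 and len(protein) - 1")
    left = protein[max(0, cleavage_pos - flank):cleavage_pos]
    right = protein[cleavage_pos:cleavage_pos + flank]
    return left.rjust(flank, pad) + right.ljust(flank, pad)


protein = "ACDEFGHIKLMNPQRSTVWY"  # made-up example sequence
print(cleavage_window(protein, 10, flank=3))
print(cleavage_window(protein, 2, flank=3))
```

Keeping all windows the same length, with the cleaved bond at the centre, makes them directly usable as aligned input for sequence logo or motif tools.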