Proteomics Data Analysis

The Proteomics Data Analysis material was prepared by the MRC Toxicology Unit Bioinformatics and mass4tox Proteomics facilities to provide training in the basics of proteomics analyses.

It assumes the user’s data has been processed by Proteome Discoverer, as per standard Proteomics facility workflows.

Tutorials take the form of R Markdown notebooks (see links below). If you would like to contribute or suggest modifications to the material, please see the GitHub page.

Prerequisites

R

You should be comfortable using R. We will be using base R functions like lapply, gsub and file.path, alongside tidyverse functions like group_by, mutate and ggplot. If these are not familiar, we recommend undertaking training in R and the tidyverse beforehand. We recommend using R >= 4.1.2, since the material has not been tested on earlier versions.
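As a quick self-check, the sketch below exercises the base R functions named above and compares the running R version with the recommended minimum. The file names, the `data` directory and the toy table are hypothetical and exist only for illustration. The tidyverse part runs only if dplyr happens to be installed.

```r
# Compare the running R version with the recommended minimum
r_ok <- getRversion() >= "4.1.2"

# Hypothetical file names, used only for illustration
files <- c("LFQ_PeptideGroups.txt", "TMT_PeptideGroups.txt")

# gsub: strip the suffix to recover a short experiment label
labels <- gsub("_PeptideGroups\\.txt$", "", files)

# lapply + file.path: build one path per file under a hypothetical directory
paths <- lapply(files, function(f) file.path("data", f))
names(paths) <- labels

# Tidyverse-style grouped calculation, run only if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  toy <- data.frame(protein = c("P1", "P1", "P2"), intensity = c(10, 20, 40))
  toy <- dplyr::mutate(
    dplyr::group_by(toy, protein),
    mean_intensity = mean(intensity)
  )
}
```

If each line reads naturally to you, you likely have enough R background to follow the notebooks.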

The [Bioinformatics](https://www.mrc-tox.cam.ac.uk/facilities/bioinformatics) facility provide separate training covering basic R, data carpentry (using the tidyverse) and plotting (using ggplot2). If there is not a course scheduled, you can get recordings by emailing bioinfo@mrc-tox.cam.ac.uk.

The Cambridge Bioinformatics Training centre also offer a regular course on R for Biologists.

RStudio

The material will be taught in live coding sessions through RStudio, and we recommend using this environment whenever you use R. Installation instructions can be found here.

Proteomics

The materials herein assume you have attended Cat Franco's introduction to the principles of bottom-up proteomics by mass spectrometry.

Course dependencies and data

To ensure all the necessary R packages are installed for you to run the code, you can install the Proteomics.data.analysis package like so:

remotes::install_github("MRCToxBioinformatics/Proteomics_data_analysis", dependencies='Suggests')

This will also install the Proteomics.analysis.data package which contains the data we will use.
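After installation, it can be useful to confirm that both packages are available before a session. The following is a minimal sketch, assuming the installed package names match those given in this section. The helper `check_installed` is a hypothetical name introduced here, not part of the course package.

```r
# Package names as given in this section (assumed to match the installed names)
course_pkgs <- c("Proteomics.data.analysis", "Proteomics.analysis.data")

# Return a named logical vector: TRUE where a package namespace can be loaded
check_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

status <- check_installed(course_pkgs)
missing_pkgs <- names(status)[!status]

# Report anything that still needs installing
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

If anything is reported as missing, re-running the installation command above is the first thing to try.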

Course materials

The first part of the course is broken into sections for different 'flavours' of quantitative bottom-up proteomics by mass spectrometry. Each section contains subsections covering:

Data processing and QC, which starts from the Proteome Discoverer (PD) output files and performs filtering, quality control and data processing to obtain the quantification data
Statistical testing for differential abundance

Additional subsections are included to cover further topics for each flavour.
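To give a feel for what the filtering, quality control and processing steps involve, here is a minimal sketch in base R on an invented peptide-level table. The column names and values are hypothetical and do not reflect the actual Proteome Discoverer output format. The log2 transformation is an assumption about a typical processing step. The notebooks use their own functions for these steps.

```r
# Toy peptide-level table; column names and values are invented for illustration
peptides <- data.frame(
  sequence    = c("AAK", "CCR", "DDK", "EEK"),
  protein     = c("P1", "P1", "", "cRAP001"),
  contaminant = c(FALSE, FALSE, FALSE, TRUE),
  sample_1    = c(100, NA, 50, 900),
  sample_2    = c(120, 80, 55, 950),
  stringsAsFactors = FALSE
)

# Filtering: drop contaminants and peptides without an assigned protein
keep <- !peptides$contaminant & peptides$protein != ""
filtered <- peptides[keep, ]

# QC: fraction of missing quantification values per sample
sample_cols <- c("sample_1", "sample_2")
missing_fraction <- colMeans(is.na(filtered[, sample_cols]))

# Processing: log2-transform the intensities for downstream analysis
log_intensity <- log2(as.matrix(filtered[, sample_cols]))
rownames(log_intensity) <- filtered$sequence
```

Here the peptide without a protein assignment and the contaminant are removed, and `missing_fraction` shows how much quantification data is absent in each sample.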

In addition to the core part of the course, there are extended materials to cover:

Phosphoproteomics using Tandem Mass Tags

1. Label-Free Quantification (LFQ)

Data processing and QC
Statistical testing
Comparing robust and maxLFQ summarisation to protein-level abundances
An alternative normalisation using a prior expectation

2. Tandem-Mass Tags (TMT)

Data processing and QC
Statistical testing

3. Stable Isotope Labelling by/with Amino acids in Cell culture (SILAC)

Data processing and QC
Statistical testing
Incorporation rate testing
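For orientation, heavy-label incorporation is commonly estimated per peptide as the heavy intensity divided by the sum of heavy and light intensities. The sketch below applies that formula to invented intensities. It is an assumption about the calculation and the notebook may compute or summarise incorporation differently.

```r
# Invented heavy and light intensities for three peptides
heavy <- c(980, 450, 990)
light <- c(20, 50, 10)

# Incorporation per peptide: heavy signal as a fraction of the total signal
incorporation <- heavy / (heavy + light)

# Summarise across peptides with the median, expressed as a percentage
median_incorporation_pct <- 100 * median(incorporation)
```

The median is used here because it is less sensitive than the mean to a few poorly quantified peptides.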

Extended materials

Phosphoproteomics using TMT
Phosphoproteomics statistical testing
