The Proteomics Data Analysis material was prepared by the MRC Toxicology Unit Bioinformatics and mass4tox Proteomics facilities to provide training in the basics of proteomics analyses.
It assumes the user’s data has been processed by Proteome Discoverer, as per standard Proteomics facility workflows.
Tutorials take the form of R Markdown notebooks (see links below). If you would like to contribute or suggest modifications to the material, please see the GitHub page.
You should be comfortable using R. We will be using base R functions like `lapply`, `gsub` and `file.path`, alongside tidyverse functions like `group_by`, `mutate` and `ggplot`. If these are not familiar, we recommend undertaking training in R and the tidyverse beforehand. We recommend using R >= 4.1.2, since the material has not been tested on earlier versions.
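As a quick self-check, the short sketch below uses each of the functions named above on toy inputs. The file names and the `toy` data frame are invented purely for illustration and are not part of the course data. The tidyverse part only runs if `dplyr` and `ggplot2` are already installed.

```r
# Hypothetical file names, invented for illustration only
files <- c("sample_A_PeptideGroups.txt", "sample_B_PeptideGroups.txt")

# file.path() joins path components with the correct separator
paths <- file.path("data", files)

# gsub() removes the (hypothetical) suffix to leave a sample label
samples <- gsub("_PeptideGroups\\.txt$", "", files)

# lapply() applies a function to each element and returns a list
name_lengths <- lapply(samples, nchar)

# Check that the running R version is at least the recommended one
r_version_ok <- getRversion() >= "4.1.2"

# Tidyverse functions, run only when the packages are available
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(dplyr))
  library(ggplot2)

  # Toy abundances, three made-up measurements per sample
  toy <- data.frame(
    sample = rep(samples, each = 3),
    abundance = c(10, 12, 11, 20, 19, 21)
  )

  # group_by() + mutate() centre each value on its own sample mean
  toy_centred <- toy %>%
    group_by(sample) %>%
    mutate(centred = abundance - mean(abundance)) %>%
    ungroup()

  # ggplot() builds a plot object; print(p) would draw it
  p <- ggplot(toy_centred, aes(x = sample, y = centred)) +
    geom_point()
}
```

If every line here reads naturally to you, you should be well placed to follow the notebooks.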
The [Bioinformatics](https://www.mrc-tox.cam.ac.uk/facilities/bioinformatics) facility provide separate training covering basic R, data carpentry (using the tidyverse) and plotting (using ggplot2). If there is not a course scheduled, you can get recordings by emailing [email protected].
The Cambridge Bioinformatics Training centre also offer a regular course on R for Biologists.
The material will be taught in live coding sessions through RStudio, and we recommend using this environment whenever you use R. Installation instructions can be found here.
The materials herein assume you have attended Cat Franco's introduction to the principles of bottom-up proteomics by Mass-Spectrometry.
To ensure all the necessary R packages are installed for you to run the code, you can install the Proteomics.data.analysis package like so:
remotes::install_github("MRCToxBioinformatics/Proteomics_data_analysis", dependencies='Suggests')
This will also install the Proteomics.analysis.data package, which contains the data we will use.
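The install command relies on the `remotes` package being available. A minimal sketch for checking what is still missing, before or after installation, is given below. `missing_packages` is a hypothetical helper written for this example, and the two course package names are taken from the text above on the assumption that they match the installed package names.

```r
# Hypothetical helper: return the packages in `pkgs` that cannot be loaded
# from the current library paths
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package names as given in the text above (assumed, not verified here)
to_check <- c("remotes", "Proteomics.data.analysis", "Proteomics.analysis.data")

# Names of any packages that still need installing
still_missing <- missing_packages(to_check)
print(still_missing)
```

If `remotes` is reported as missing, `install.packages("remotes")` will install it from CRAN before you run the command above.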
The first part of the course is broken into sections for different 'flavours' of quantitative bottom-up proteomics by Mass-spectrometry. Each section contains subsections covering:
- Data processing and QC, which starts from the Proteome Discoverer (PD) output files and performs filtering, quality control and data processing to obtain the quantification data
- Statistical testing for differential abundance
Additional subsections are included to cover further topics for each flavour.
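To give a feel for the overall shape of these two steps, here is a deliberately simplified sketch in base R on simulated data. It is not the method used in the notebooks: the data, the complete-case filter, the Welch t-test and the Benjamini-Hochberg correction are all assumptions chosen to keep the example short.

```r
set.seed(1)

# Simulated log2 abundances: 6 hypothetical proteins x 6 samples
condition <- rep(c("control", "treated"), each = 3)
abundance <- matrix(
  rnorm(36, mean = 20, sd = 0.5),
  nrow = 6,
  dimnames = list(paste0("protein_", 1:6), paste0(condition, "_", 1:3))
)

# Make one protein more abundant in the treated samples
treated <- condition == "treated"
abundance["protein_1", treated] <- abundance["protein_1", treated] + 3

# Mimic a missing quantification value
abundance["protein_6", 2] <- NA

# Filtering step: keep proteins quantified in every sample
complete <- abundance[rowSums(is.na(abundance)) == 0, , drop = FALSE]

# Testing step: Welch t-test per protein, control vs treated
p_values <- apply(complete, 1, function(x) {
  t.test(x[condition == "control"], x[condition == "treated"])$p.value
})

# Adjust for multiple testing (Benjamini-Hochberg)
results <- data.frame(
  protein = rownames(complete),
  p_value = p_values,
  p_adjusted = p.adjust(p_values, method = "BH")
)
print(results)
```

Real datasets need more careful handling of missing values, normalisation and the statistical model, which is what the individual sections cover.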
In addition to the core part of the course, there are extended materials to cover:
- Phosphoproteomics using Tandem Mass Tags