Update paper/paper.md
Co-authored-by: Dani Bodor <[email protected]>
gcroci2 and DaniBodor authored Sep 15, 2023
1 parent 5725b82 commit b30a7bc
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion paper/paper.md
@@ -66,7 +66,7 @@ We present DeepRank2, a deep learning (DL) framework geared towards making predi
The 3D structure of proteins and protein complexes provides fundamental information to understand biological processes at the molecular scale. Exploiting or engineering these molecules is key for many biomedical applications such as drug design [@GANE2000401], immunotherapy [@sadelain_basic_2013], or designing novel proteins [@nonnaturalppi]. For example, PPI data can be harnessed to address critical challenges in the computational prediction of peptides presented on the major histocompatibility complex (MHC) protein, which play a key role in T-cell immunity. Protein structures can also be exploited in molecular diagnostics for the identification of SRVs, which can be pathogenic sequence alterations in patients with inherited diseases [@mut_cnn; @shroff].

[comment]: <> (What makes using 3D protein structures with DL possible)
- In the past decades, a variety of experimental methods (e.g., X-ray crystallography, nuclear magnetic resonance, cryogenic electron microscopy) have determined and accumulated a large number of atomic-resolution 3D structures of proteins and protein-protein complexes [@schwede_protein_2013]. Because experimental determination of structures is a tedious and expensive process, several computational prediction methods have also been developed over the past few years, such as Alphafold [@alphafold_2021] for single proteins, and PANDORA [@pandora], HADDOCK [@haddock], and Alphafold-Multimer [@alphafold_multi] for protein complexes. The large amount of data available makes it possible to use DL to leverage 3D structures and learn their complex patterns. Unlike other machine learning (ML) techniques, deep neural networks hold the promise of learning from millions of data without reaching a performance plateau quickly, which is made computationally feasible by hardware accelerators (i.e., GPUs, TPUs) and parallel file system technologies.
+ In the past decades, a variety of experimental methods (e.g., X-ray crystallography, nuclear magnetic resonance, cryogenic electron microscopy) have determined and accumulated a large number of atomic-resolution 3D structures of proteins and protein-protein complexes [@schwede_protein_2013]. Because experimental determination of structures is a tedious and expensive process, several computational prediction methods have also been developed over the past few years, such as Alphafold [@alphafold_2021] for single proteins, and PANDORA [@pandora], HADDOCK [@haddock], and Alphafold-Multimer [@alphafold_multi] for protein complexes. The large amount of data available makes it possible to use DL to leverage 3D structures and learn their complex patterns. Unlike other machine learning (ML) techniques, deep neural networks hold the promise of learning from millions of data points without reaching a performance plateau quickly, which is made computationally feasible by hardware accelerators (i.e., GPUs, TPUs) and parallel file system technologies.

[comment]: <> (Examples of DL with PPIs and SRVs)
3D CNNs have been trained on 3D grids for the classification of biological vs. crystallographic PPIs [@renaud_deeprank_2021], and for the scoring of models of protein-protein complexes generated by computational docking [@renaud_deeprank_2021; @dove]. Gainza et al. have applied geodesic CNNs to extract protein interaction fingerprints by applying 2D CNNs on spread-out protein surface patches [@masif]. 3D CNNs have been used for exploiting protein structure data for predicting mutation-induced changes in protein stability [@mut_cnn] and identifying novel gain-of-function mutations [@shroff]. In contrast to CNNs, the convolution operations in GNNs can rely on the relative local connectivity between nodes rather than on the data orientation, making graph representations rotation invariant. Additionally, GNNs can accept graphs of any size, while in a CNN the size of the 3D grid must be the same for all input data, which may be problematic for datasets containing structures of highly variable size. Based on these arguments, different GNN-based tools have been designed to predict patterns from PPIs [@dove_gnn; @fout_protein_nodate; @reau_deeprank-gnn_2022]. Eismann et al. developed a rotation-equivariant neural network trained on a point-based representation of the protein atomic structure to classify PPIs [@rot_eq_gnn].
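The size argument above can be made concrete with a minimal NumPy sketch (illustrative only, not DeepRank2 code): a single neighborhood-averaging step, the simplest form of graph convolution, applies unchanged to graphs with any number of nodes, whereas a 3D CNN requires every input grid to share one fixed shape.

```python
# Minimal, hypothetical sketch: one message-passing (neighbor-averaging) step
# on graphs of arbitrary size. Names and shapes here are illustrative only.
import numpy as np

def message_passing_step(node_feats, edges):
    """Average each node's incoming neighbor features.

    node_feats: (n_nodes, n_feats) array of per-node features
    edges: list of (src, dst) index pairs (list both directions if undirected)
    """
    agg = np.zeros_like(node_feats)
    degree = np.zeros(node_feats.shape[0])
    for src, dst in edges:
        agg[dst] += node_feats[src]
        degree[dst] += 1
    degree = np.maximum(degree, 1)  # avoid division by zero for isolated nodes
    return agg / degree[:, None]

# Two "structures" with very different numbers of nodes: the same operation
# handles both with no padding or resizing, unlike a fixed-size 3D grid.
small = message_passing_step(np.ones((4, 8)), [(0, 1), (1, 0), (1, 2), (2, 1)])
large = message_passing_step(np.ones((300, 8)), [(i, i + 1) for i in range(299)])
```

Because the aggregation depends only on which nodes are connected, not on where they sit in space, rotating the input structure leaves the result unchanged, which is the rotation-invariance property contrasted with grid-based CNNs above.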
