PINNED is a neural network model designed to produce druggability scores for four separate contributions to protein druggability, as well as a final superscore representing the predicted likelihood that a protein is druggable.
The model consists of four separate deep neural networks: Sequence and structure, Localization, Biological functions, and Network information. Each subnetwork has a single output neuron, and the four outputs are summed to generate an overall druggability score logit.
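
The way the four subnetwork outputs combine can be sketched in a few lines of plain Python. This is only an illustration, not the code in `PINNED_model.ipynb`: the function name `combine_subscores`, the dictionary keys, and the example values are hypothetical, and the sigmoid step is an assumption about how the summed logit is turned into a likelihood.

```python
import math

# Keys mirror the four contributions named in this README; the identifiers
# themselves are hypothetical and chosen only for this sketch.
SUBNETWORKS = (
    "sequence_and_structure",
    "localization",
    "biological_functions",
    "network_information",
)


def combine_subscores(sub_logits):
    """Sum the four single-neuron outputs into one druggability logit."""
    missing = [name for name in SUBNETWORKS if name not in sub_logits]
    if missing:
        raise KeyError(f"missing subnetwork outputs: {missing}")
    logit = sum(sub_logits[name] for name in SUBNETWORKS)
    # Assumption: the summed logit is mapped to a probability with a sigmoid.
    probability = 1.0 / (1.0 + math.exp(-logit))
    return logit, probability


# Made-up outputs for a single protein, for illustration only.
example = {
    "sequence_and_structure": 1.2,
    "localization": -0.4,
    "biological_functions": 0.3,
    "network_information": 0.9,
}
logit, probability = combine_subscores(example)
```

Because the overall logit is a plain sum, each subnetwork's output can be read as that contribution's additive share of the final score.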
Prior to model training, `fpocket_pipeline.ipynb` was used to generate the protein drug pocket scores found in `fpocket_output.csv`, which were combined with other protein data stored in `raw_data`. `feature_processing.ipynb` was then used to produce a single processed feature matrix, as well as lists of the feature names fed into each subnetwork; these lists can be found in `processed_data`. Note that the feature matrix itself is not included here due to its size; however, it can be generated by running the `feature_processing.ipynb` notebook.
Subsequently, `PINNED_model.ipynb` was used to train and cross-validate the model and produce scores for each of the proteins in the dataset.
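
As a rough illustration of how the per-subnetwork name lists could be used, the sketch below selects each subnetwork's columns from a single feature matrix. It assumes each `*_names.csv` file holds one feature name per row, which this README does not confirm; the helpers `read_names` and `select_columns` and the toy feature names are hypothetical.

```python
import csv
import io


def read_names(handle):
    """Read feature names from a CSV handle (assumed: one name per row)."""
    return [row[0] for row in csv.reader(handle) if row]


def select_columns(header, rows, names):
    """Return the sub-matrix whose columns match `names`, in that order."""
    index = {name: i for i, name in enumerate(header)}
    columns = [index[name] for name in names]
    return [[row[i] for i in columns] for row in rows]


# Toy feature matrix with made-up feature names, two proteins as rows.
header = ["pocket_score", "nucleus", "kinase_activity", "degree"]
rows = [[0.7, 1, 0, 12], [0.2, 0, 1, 3]]

# In-memory stand-ins for two of the name lists in processed_data.
seq_names = read_names(io.StringIO("pocket_score\n"))
loc_names = read_names(io.StringIO("nucleus\n"))

seq_inputs = select_columns(header, rows, seq_names)
loc_inputs = select_columns(header, rows, loc_names)
```

Keeping one shared matrix plus separate name lists means that each subnetwork sees only its own feature group while all four stay aligned on the same proteins.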
- 🗎 fpocket_pipeline.ipynb
- 🗎 feature_processing.ipynb
- 🗎 PINNED_model.ipynb
- 📁 raw_data
  - 🗎 all_proteins.csv
  - 🗎 dezso_features.csv
  - 🗎 fpocket_output.csv
  - 🗎 gdpc_10-14-22.csv
  - 🗎 go_components_10-14-22.csv
  - 🗎 go_functions_10-14-22.csv
  - 🗎 go_processes_10-14-22.csv
  - 🗎 paac_10-14-22.csv
  - 🗎
- 📁 processed_data
  - 🗎 bio_func_names.csv
  - 🗎 localization_names.csv
  - 🗎 network_info_names.csv
  - 🗎 seq_and_struc_names.csv
  - 🗎
- 🗎 README.md
- Michael Cunningham — developed the neural network model
- Danielle Pins — generated the fpocket data
Rights to AlphaFold and fpocket are governed by their respective licenses.