This repository contains my solutions for the assignments and projects of Machine Learning for Bioinformatics Course (Graduate Course) at Sharif University of Technology (Spring 2020)

AlirezAkbary/ML_Bio_Course

Machine Learning for Bioinformatics

My solutions to the assignments and research project of the Machine Learning for Bioinformatics course (graduate level).

Project: Drug-Protein Affinity Classification

The project predicts whether a given protein and drug bind. In phase 1, this is done with classical machine learning, using XGBoost.

In phase 2, the goal was to address the limitations of the well-known DeepDTA model. Drawing on the state-of-the-art GraphDTA paper, which takes advantage of graph neural networks, I modified DeepDTA in two ways: an LSTM learns the protein sequence (DeepDTA does not take the sequential nature of the target's amino-acid sequence into account), and a graph convolutional network learns the drug structure. I also applied several interpretability methods to the trained network. They gave a valuable insight: the learned model depends heavily on the drugs and does not focus reasonably on the proteins.
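To make the drug branch more concrete, here is a minimal sketch of a single graph-convolution step in plain Python. It is not the code from this repository (which, judging by the reading list, relies on PyTorch and PyTorch Geometric); it assumes the common normalised form `ReLU(D^-1/2 (A + I) D^-1/2 X W)`, and the three-atom molecule, the feature encoding, and the weight values are all hypothetical.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    in_dim = len(features[0])
    out_dim = len(weights[0])

    # Add self-loops so every atom also keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]

    # Symmetric normalisation by node degree (degree counts the self-loop).
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(in_dim)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(in_dim)))
             for o in range(out_dim)] for i in range(n)]


# Hypothetical 3-atom chain (C-C-O) with one-hot atom types [is_C, is_O].
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
features = [[1.0, 0.0],
            [1.0, 0.0],
            [0.0, 1.0]]
weights = [[0.5, -0.2],
           [0.1, 0.4]]  # illustrative values, not trained parameters

node_embeddings = gcn_layer(adjacency, features, weights)

# Mean-pool the atom embeddings into one vector for the whole molecule.
drug_vector = [sum(col) / len(col) for col in zip(*node_embeddings)]
print(drug_vector)
```

In a full model, a pooled vector like `drug_vector` would be concatenated with the protein representation (here, the LSTM output) before the final prediction layers; mean pooling is only one possible readout.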

My literature review on the topic covered:

  • DeepDTA: Deep Drug-Target Binding Affinity Prediction (arxiv)
  • GraphDTA: prediction of drug–target binding affinity using graph convolutional networks (bioRxiv)
  • DeepGS: Deep Representation Learning of Graphs and Sequences for Drug-Target Binding Affinity Prediction (arxiv)
  • Saliency maps for DNN interpretation: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps (arxiv)
  • Guided backpropagation for DNN interpretation: Striving for Simplicity: The All Convolutional Net (arxiv)
  • LRP for DNN interpretation: On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation (arxiv)
  • Microsoft Research Tutorials on Graph Neural Networks. (link), (link)
  • PyTorch Geometric Extension Library. (link)
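
The saliency-map idea from the reading list can be used to check how much a two-input model relies on each input, which is the kind of analysis behind the finding that the model leans on the drugs. The sketch below is a toy illustration only: `toy_affinity_model` is a hypothetical stand-in with hand-picked weights, not the trained network, and gradients are approximated by finite differences rather than computed by backpropagation as in the paper.

```python
import math


def toy_affinity_model(drug, protein):
    """Hypothetical stand-in for a trained two-branch binding classifier."""
    drug_weights = [0.9, -0.7, 0.8]              # large: drug branch dominates
    protein_weights = [0.05, -0.02, 0.03, 0.01]  # small: protein barely matters
    z = sum(w * x for w, x in zip(drug_weights, drug))
    z += sum(w * x for w, x in zip(protein_weights, protein))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(model, drug, protein, eps=1e-5):
    """Return |d output / d input| per feature via central differences."""
    def grad_wrt(vec, call):
        grads = []
        for i in range(len(vec)):
            up = list(vec)
            down = list(vec)
            up[i] += eps
            down[i] -= eps
            grads.append(abs(call(up) - call(down)) / (2 * eps))
        return grads

    drug_sal = grad_wrt(drug, lambda d: model(d, protein))
    protein_sal = grad_wrt(protein, lambda p: model(drug, p))
    return drug_sal, protein_sal


# Made-up feature vectors for one drug-protein pair.
drug = [1.0, 0.5, -0.3]
protein = [0.2, -1.0, 0.7, 0.4]

drug_sal, protein_sal = saliency(toy_affinity_model, drug, protein)
drug_share = sum(drug_sal) / (sum(drug_sal) + sum(protein_sal))
print(f"share of total saliency on drug features: {drug_share:.2f}")
```

A share close to 1 means the prediction barely changes when the protein input is perturbed. On a real network the same comparison would be made with backpropagated gradients, averaged over many drug-protein pairs.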

Homeworks

HW6

Covered topics:

  • Autoencoders
  • VAE (Theory & Implementation) (arxiv)
  • GAN (arxiv), Wasserstein GAN, Mode Collapse and Mini Batch Discrimination (arxiv), (link)
  • RNN, LSTM (Theory & Implementation)

HW5

Covered topics:

  • Hidden Markov Models
  • Deep Learning Basics, with a more rigorous look at Batch Normalization through the paper How Does Batch Normalization Help Optimization? (arxiv) and at SGD optimization in over-parameterized networks through A Convergence Theory for Deep Learning via Over-Parameterization (arxiv)
  • Universal Approximation of Neural Networks
  • MLP Implementation from Scratch
  • Reading and implementing the ResNet paper with PyTorch (arxiv), with a more rigorous look at ResNet through Visualizing the Loss Landscape of Neural Nets (arxiv) and Deep Residual Networks, Deep Learning Gets Way Deeper by Kaiming He (link)

HW4

Covered topics:

  • PCA (Theory & Implementation), ICA
  • K-Means
  • GMM (Theory & Implementation), Expectation Maximization and Variational Lower Bound
  • Reading the t-SNE paper (link)

HW3

Covered topics:

  • Ensemble Learning: Bagging and Boosting, such as Random Forest and AdaBoost (Theory & Implementation)
  • Feature Selection (Bayesian Networks, Markov Blanket, and d-separation; LASSO Regularizer)

HW2

Covered topics:

  • Perceptron (Theory & Implementation)
  • Support Vector Machine (Theory & Implementation)
  • Kernel Methods

HW1

Covered topics:

  • Basics of Information Theory
  • Decision Tree (Theory & Implementation)
  • KNN (Theory & Implementation)
  • Hypothesis Testing of Model Performance

HW0

Covered topics:

  • Review of Multivariable Calculus
  • Review of Linear Algebra
  • Review of Probability & Statistics
