ML Models for Mutation Impact Analysis

This repository implements four machine learning models that predict the impact of mutations from mutation data and protein sequence data. The models are:

Random Forest
Support Vector Machine (SVM)
Convolutional Neural Network (CNN)
Gated Recurrent Unit (GRU)

Data

Two main datasets are used in this project:

Mutation Data: Contains information about mutations in BRCA1 and BRCA2 genes.
Protein Data: Contains protein sequences related to BRCA1 and BRCA2 genes.

Models and Results

1. Random Forest

The Random Forest model was trained using the mutation data. The model achieved the following accuracy across 5-fold cross-validation:

Average Accuracy: 0.95

2. Support Vector Machine (SVM)

The SVM model was trained using the mutation data. The model achieved the following accuracy across 5-fold cross-validation:

Average Accuracy: 0.94

3. Convolutional Neural Network (CNN)

The CNN model was trained using the mutation data. The model achieved the following accuracy across 5-fold cross-validation:

Average Accuracy: 0.97

4. Gated Recurrent Unit (GRU)

The GRU model was trained using the mutation data. The model achieved the following accuracy across 5-fold cross-validation:

Average Accuracy: 0.97
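
Each figure above is an average over 5-fold cross-validation. As a rough illustration of how such an average is computed, here is a minimal, standard-library-only sketch. It is not the repository's code: the toy labels, the helper names (`k_fold_indices`, `majority_label`, `cross_validated_accuracy`), and the majority-class "model" are all assumptions made for illustration, and the real scripts presumably rely on scikit-learn or PyTorch instead.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_label(labels):
    """Stand-in 'model' (hypothetical): predict the most common training label."""
    return max(set(labels), key=labels.count)


def cross_validated_accuracy(labels, k=5, seed=0):
    """Return per-fold accuracies and their mean for the stand-in model."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, then score on the i-th.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        prediction = majority_label([labels[j] for j in train_idx])
        correct = sum(labels[j] == prediction for j in test_idx)
        scores.append(correct / len(test_idx))
    return scores, sum(scores) / len(scores)


# Toy labels (hypothetical, not from the BRCA1/BRCA2 data): 1 = impactful, 0 = not
labels = [1] * 70 + [0] * 30
fold_scores, average_accuracy = cross_validated_accuracy(labels)
print(f"Average Accuracy: {average_accuracy:.2f}")
```

Swapping `majority_label` for a real classifier's fit-and-predict step gives the same evaluation loop; only the model changes between folds, never the held-out data it is scored on.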

How to Run

Clone the repository:

```
git clone https://github.com/yonas650/ML-Models-for-Mutation-Impact-analysis.git
```

Navigate to the project directory:

```
cd ML-Models-for-Mutation-Impact-analysis
```

Install the required dependencies:
```
pip install -r requirements.txt
```

Run the models:

```
python random_forest.py
python svm.py
python cnn.py
python gru.py
```

Dependencies

Python 3.x
pandas
numpy
scikit-learn
imbalanced-learn
xgboost
torch (for CNN and GRU models)
matplotlib
seaborn

License

This project is licensed under the MIT License.
