Chest_Disease_Image_Classification

Build and deploy an end-to-end deep learning image classification application to AWS EC2 using Docker and a Jenkins CI/CD pipeline.
In this project, we aimed to classify chest diseases accurately from CT scan images, with the goal of supporting earlier diagnosis and treatment.
We used a transfer learning approach: we downloaded a pre-trained VGG16 model (a CNN architecture) from Keras and fine-tuned it on our custom dataset.
Fine-tuning was done by dropping the original dense layers and adding a custom dense layer, since our dataset has only two classes, whereas the ImageNet data used to pre-train VGG16 has 1000 classes.
The fine-tuned VGG16 model was trained on a chest CT scan image dataset with two labels: normal and adenocarcinoma.
The project structure follows a data science project template, which keeps the code modular, reusable, and maintainable. It includes modules for logging, exception handling, and utilities.
We used DagsHub with MLflow for experiment tracking and model management, which allowed us to track experiments, compare results, and manage models effectively.
We also integrated DVC (Data Version Control) to manage the data pipeline, ensuring reproducibility and collaboration among team members.
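
The fine-tuning step described above (dropping VGG16's dense layers and adding a new two-class dense layer) might look roughly like the following. This is a minimal sketch, not the repository's actual code: the function name, the input size, the SGD optimizer, the learning rate, and the choice to freeze the convolutional layers are all assumptions made for illustration.

```python
def build_two_class_vgg16(input_shape=(224, 224, 3), num_classes=2,
                          weights="imagenet", learning_rate=0.01):
    """Return VGG16 without its dense layers, plus a new softmax head.

    All default values here are illustrative assumptions.
    """
    # Imported inside the function so the sketch can be loaded
    # even on a machine where TensorFlow is not installed.
    from tensorflow import keras

    # include_top=False drops the original dense layers
    # (the 1000-class ImageNet classifier).
    base = keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape
    )

    # Assumption: freeze the pre-trained convolutional layers so that
    # only the new dense layer is trained.
    base.trainable = False

    # New head: flatten the feature maps and add one dense softmax layer
    # with one unit per class (normal, adenocarcinoma).
    features = keras.layers.Flatten()(base.output)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(features)

    model = keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_two_class_vgg16() downloads the ImageNet weights on first use; passing weights=None builds the same architecture with random weights, which is handy for a quick shape check.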

Flow of the end-to-end automated pipeline:

Data Ingestion: We downloaded the CT scan images from Google Drive using the gdown package. Images were preprocessed to reduce noise and normalize the pixel values.
Prepare Base Model: We prepared a base CNN model from the pre-trained VGG16, then customized it for our dataset by dropping the original dense layers and adding a custom dense layer for our two classes.
Model Trainer: We trained the custom CNN model on the prepared dataset, using a training-validation split to check how well the model generalizes.
Model Evaluation: We evaluated the model's performance on a test dataset, calculating metrics such as accuracy, precision, recall, and F1-score.
MLflow Integration: We integrated MLflow with the model trainer and evaluator components. This allowed us to track the experiments and manage the models effectively.
DVC Pipeline: We integrated DVC with the data ingestion, model trainer, and evaluator components. This ensured reproducibility and collaboration among team members.
Deployment: We deployed the pipeline to AWS EC2 using Docker containers, AWS ECR, and Jenkins for CI/CD.
User Application: We built the user-facing application with Flask.
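
To make the evaluation metrics named in the Model Evaluation step concrete, here is a small sketch of how accuracy, precision, recall, and F1-score follow from the confusion-matrix counts of a two-class problem. The function name and the label encoding (1 for adenocarcinoma, 0 for normal) are assumptions for illustration; the project itself may compute these metrics differently, for example through Keras or another library.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for two-class labels.

    `positive` is the label treated as the positive class
    (assumed here to be 1 = adenocarcinoma).
    """
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every division so empty or one-sided inputs return 0.0.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For a medical classifier, recall on the disease class is usually worth reporting alongside accuracy, because a missed adenocarcinoma case (a false negative) is costlier than a false alarm.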

By the end of this project, the model classified chest diseases from CT scan images with a high level of accuracy, which could support earlier diagnosis and treatment of patients with chest diseases.

Files to update for each component

Update config.yaml # to define constants
Update params.yaml
Update the entity
Update the configuration manager in src config
Update the components
Update the pipeline
Update the main.py
Update the dvc.yaml
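
The update order above reflects how values flow from the YAML files into the code: constants and hyperparameters are defined once, gathered into a typed entity, and handed to a component by the configuration manager. The sketch below illustrates that flow for one component. Every key, class name, and path in it is hypothetical, and plain dictionaries stand in for the parsed config.yaml and params.yaml so that the example needs no YAML library.

```python
from dataclasses import dataclass
from pathlib import Path

# Stand-ins for the parsed contents of config.yaml and params.yaml.
# All keys and values are hypothetical.
CONFIG = {
    "prepare_base_model": {
        "root_dir": "artifacts/prepare_base_model",
        "base_model_path": "artifacts/prepare_base_model/base_model.h5",
    }
}
PARAMS = {"IMAGE_SIZE": [224, 224, 3], "CLASSES": 2, "LEARNING_RATE": 0.01}


@dataclass(frozen=True)
class PrepareBaseModelConfig:
    """Entity: the typed settings one component needs."""
    root_dir: Path
    base_model_path: Path
    image_size: tuple
    classes: int
    learning_rate: float


class ConfigurationManager:
    """Builds entities from the config and params mappings."""

    def __init__(self, config=None, params=None):
        self.config = CONFIG if config is None else config
        self.params = PARAMS if params is None else params

    def get_prepare_base_model_config(self):
        section = self.config["prepare_base_model"]
        return PrepareBaseModelConfig(
            root_dir=Path(section["root_dir"]),
            base_model_path=Path(section["base_model_path"]),
            image_size=tuple(self.params["IMAGE_SIZE"]),
            classes=self.params["CLASSES"],
            learning_rate=self.params["LEARNING_RATE"],
        )
```

A component would then receive the entity, for example ConfigurationManager().get_prepare_base_model_config(), instead of reading the YAML files itself, which keeps file paths and hyperparameters out of the component code.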

Git commands

git add .

git commit -m "Updated"

git push origin main

How to run?

conda create -n chest python=3.8 -y
conda activate chest
pip install -r requirements.txt

MLflow DagsHub connection URI (get these values from the experiments section of your dagshub.com repository)

MLFLOW_TRACKING_URI=<your MLflow tracking URI>
MLFLOW_TRACKING_USERNAME=<your DagsHub username>
MLFLOW_TRACKING_PASSWORD=<your DagsHub password or token>

Run from a bash terminal

export MLFLOW_TRACKING_URI="<your MLflow tracking URI>"
export MLFLOW_TRACKING_USERNAME="<your DagsHub username>"
export MLFLOW_TRACKING_PASSWORD="<your DagsHub password or token>"

dvc init # initializes DVC (generates the .dvc directory and the .dvcignore file)
dvc repro # runs the stages in dvc.yaml, creates the artifacts, and writes dvc.lock
dvc dag # displays the pipeline stages as a graph
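
Because the pipeline depends on the three MLflow variables being exported, it can help to check for them before starting a run. The helper below is a hypothetical sketch (the function name is not part of this repository) that reports which variables are missing or empty. Treating an empty value as missing matters because in bash, a space after the = sign in an export statement leaves the variable empty.

```python
import os

REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_mlflow_vars(environ=None):
    """Return the names of required variables that are unset or blank.

    `environ` defaults to the current process environment; passing a
    plain dict makes the check easy to try out.
    """
    environ = os.environ if environ is None else environ
    return [
        name
        for name in REQUIRED_VARS
        if not environ.get(name, "").strip()
    ]
```

One way to use it is to call missing_mlflow_vars() at the top of main.py and stop with a clear message if the returned list is not empty.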


malleswarigelli/Chest_Disease_Image_Classification_
