
Early detection of lung cancer using the LUNA dataset with LIDC-IDRI annotations, using two models: a nodule classification model (GoogLeNet) and a malignancy classification model (LeNet). This was for Kaggle's Data Science Bowl 2017.


abdullahtarek/Early-Detection-of-lung-cancer-using-machine-learning


Early-Detection-of-lung-cancer-using-machine-learning

This project aims to find nodules in a 3D lung CT scan and give each nodule a malignancy score from 0 to 4, where 0 is not malignant at all and 4 is the most malignant. It uses the LUNA dataset with two sets of annotations: the LUNA dataset's own annotations and the LIDC-IDRI annotations for malignancy. The problem had to be divided into two parts because it is a needle-in-a-haystack problem, not a simple classification problem.
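The two-part split can be sketched as follows. This is only an illustration of the idea, not the project's actual code: `score_scan`, `is_nodule`, and `malignancy_score` are hypothetical names, and the two callables stand in for the trained nodule classifier and malignancy classifier.

```python
def score_scan(candidates, is_nodule, malignancy_score):
    """Run a two-stage pass over candidate patches from one CT scan.

    candidates       -- iterable of candidate patches (any object)
    is_nodule        -- hypothetical stage-1 callable: patch -> bool
    malignancy_score -- hypothetical stage-2 callable: patch -> int in 0..4
    """
    results = []
    for patch in candidates:
        # Stage 1: discard the many candidates that are not nodules.
        if not is_nodule(patch):
            continue
        # Stage 2: score only the few patches that survived stage 1.
        results.append((patch, malignancy_score(patch)))
    return results
```

Because most candidates are rejected in the first stage, the malignancy model only ever sees patches that already look like nodules.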

Pipeline


Skewed Data

Incremental training was used for the GoogLeNet model because the data was skewed: there were far more negatives than positives. For example, I trained on 100 positives and 100 negatives, then incrementally added more and more negatives so that the model would not always predict negative.
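A minimal sketch of such a schedule is shown here. The function name, the step size of 100, and the fixed seed are assumptions made for illustration; the actual schedule used in the project is not documented in this README.

```python
import random


def incremental_training_sets(positives, negatives,
                              start_negatives=100, step=100, seed=0):
    """Yield (n_negatives, samples) for successive training rounds.

    Every round contains all positives (label 1) plus a growing prefix
    of the shuffled negatives (label 0). The start size and step are
    illustrative assumptions, not values taken from the project.
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)

    n = min(start_negatives, len(shuffled))
    while True:
        samples = [(x, 1) for x in positives]
        samples += [(x, 0) for x in shuffled[:n]]
        rng.shuffle(samples)  # mix the two classes within the round
        yield n, samples
        if n >= len(shuffled):
            break  # every negative has been included
        n = min(n + step, len(shuffled))
```

Each yielded set would be passed to one further round of training on the same model, so the classifier first learns from a balanced set and only later sees the full imbalance.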

Using LIDC-IDRI annotations

The LUNA dataset is a subset of the LIDC dataset, but no previous implementation made use of that. Comparing the IDs in the LIDC XML annotations with those in LUNA showed that the same IDs are used in both. The XML annotations list many features for each nodule, but only the malignancy is used in this project. Each nodule's position is referenced by an edge map, so the centroid of the nodule was calculated from it to generate the training data.
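The centroid step might look like the sketch below. It assumes the edge map has already been read from the XML into a list of `(x, y, z)` contour points, and it takes the centroid to be the mean of those points, which is an approximation; the project's own parsing and centroid code may differ. `nodule_centroid` is a hypothetical name.

```python
def nodule_centroid(edge_points):
    """Return the mean (x, y, z) of a nodule's edge-map points.

    edge_points -- iterable of (x, y, z) tuples taken from the contour
                   of one nodule, possibly spanning several slices.
    Averaging contour points is an assumed approximation of the
    nodule's centroid, not necessarily the project's exact method.
    """
    points = list(edge_points)
    if not points:
        raise ValueError("a nodule needs at least one edge point")
    count = len(points)
    return tuple(sum(p[axis] for p in points) / count for axis in range(3))


# Example: a square contour drawn on two slices (z = 10 and z = 12).
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
edge_map = [(x, y, z) for z in (10, 12) for (x, y) in square]
centroid = nodule_centroid(edge_map)
```

A patch centred on this point, labelled with the nodule's malignancy value, would then form one training example.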

GUI
