Machine learning project for detecting whether a patient has diabetes.
Python > 2.7
This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.
Note: You do not need to download the data. Two subsets of the data, one for training and one for testing, have been created and posted for download. The original data set is also included for your further testing and experimenting.
The list of the fields in order of columns in the data file is:
- Pregnancies: Number of times pregnant
- Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- BloodPressure: Diastolic blood pressure (mm Hg)
- SkinThickness: Triceps skin fold thickness (mm)
- Insulin: 2-hour serum insulin (mu U/ml)
- BMI: Body mass index (weight in kg/(height in m)^2)
- DiabetesPedigreeFunction: Diabetes pedigree function
- Age: Age (years)
- Outcome: Class variable (0 = no diabetes, 1 = diabetes). 268 of the 768 instances are 1; the others are 0.
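
As a minimal sketch of reading a file laid out in the column order listed above, the snippet below parses comma-separated lines with Python's standard `csv` module and counts the two outcome classes. The comma delimiter, the presence of a header row, and the helper names `load_rows` and `class_counts` are assumptions for illustration; this README does not specify them.

```python
import csv
from collections import Counter

# Column order as given in the field list.
FIELDS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]


def load_rows(lines, has_header=True):
    """Parse an iterable of CSV lines into dicts keyed by FIELDS."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # assumption: the first line holds column names
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        if len(record) != len(FIELDS):
            raise ValueError(
                "expected %d columns, got %d" % (len(FIELDS), len(record))
            )
        row = {name: float(value) for name, value in zip(FIELDS, record)}
        row["Outcome"] = int(row["Outcome"])  # class label: 0 or 1
        rows.append(row)
    return rows


def class_counts(rows):
    """Count how many rows fall into each Outcome class."""
    return Counter(row["Outcome"] for row in rows)
```

Because `load_rows` accepts any iterable of lines, it can be given an open file handle for whichever of the posted data files you are using, and `class_counts` then gives a quick check of the class balance.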
For more information, see:
- Kaggle data-set related to this data.
- NCBI article.

Model accuracy: 74.15%
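
For context on that figure, here is a minimal sketch of how accuracy is usually computed, together with the majority-class baseline implied by the class counts in the field list (268 positives out of 768). The helper names are hypothetical, and this is not necessarily how the project computes its reported number.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / float(len(y_true))


def majority_baseline(n_positive, n_total):
    """Accuracy obtained by always predicting the more common class."""
    n_negative = n_total - n_positive
    return max(n_positive, n_negative) / float(n_total)


# Class counts stated in the field list: 268 of 768 outcomes are 1.
baseline = majority_baseline(268, 768)
print("majority-class baseline: %.2f%%" % (100 * baseline))
# prints "majority-class baseline: 65.10%"
```

The baseline here is for the full data set. The class balance of the test subset is not stated in this README, so the comparison with the reported accuracy is only approximate.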