Predicting BMI/weight from Health Indicators

This project is part of the Machine Learning Project course at UCLouvain (Université catholique de Louvain), conducted during the academic year 2021/2022.

The main objective is to identify the most effective machine learning model for predicting weight and BMI. Using a training dataset of health-related features, we aim to build a model that accurately predicts an individual's weight or BMI.

Dataset

The dataset consists of two files:

X1.csv: This file contains all the feature variables. The variables include Age, Height, and several other categorical health indicators.
Y1.csv: This file contains the target variable - Weight.
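
As a rough illustration of how the two files fit together, the sketch below reads a feature file and a target file and joins them column-wise. It assumes the rows of X1.csv and Y1.csv are in the same order and that both files have a header row. The helper name load_dataset and the tiny in-memory tables are hypothetical stand-ins, not the project's actual code or data.

```python
import io

import pandas as pd


def load_dataset(x_source, y_source):
    """Read features and target, then join them column-wise (hypothetical helper)."""
    X = pd.read_csv(x_source)
    y = pd.read_csv(y_source)
    # Assumption: row i of the feature file matches row i of the target file.
    if len(X) != len(y):
        raise ValueError("feature and target files have different row counts")
    return pd.concat([X, y], axis=1)


# Made-up stand-ins for X1.csv and Y1.csv so the sketch runs without the real files.
x_csv = io.StringIO("Age,Height\n21,1.62\n35,1.80\n")
y_csv = io.StringIO("Weight\n64.0\n90.5\n")

df = load_dataset(x_csv, y_csv)
print(df)
```

With the real files, the call would presumably be load_dataset("X1.csv", "Y1.csv").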

Project Files

The following are the main files included in this project:

BMI_prediction-ToDeliver.ipynb: Contains all the code for data preprocessing, model training, hyperparameter tuning, model evaluation, and visualization for the BMI prediction task.
Weight_prediction-ToDeliver.ipynb: Contains all the code for data preprocessing, model training, hyperparameter tuning, model evaluation, and visualization for the weight prediction task.
X1.csv: Dataset file containing all the feature variables.
Y1.csv: Dataset file containing the target variable.

Methods

In this project, we explored several machine learning models to predict weight and BMI. The models tested include:

Multi-Layer Perceptron (MLP)
Linear Regression (LR)
k-Nearest Neighbors (KNN)
Support Vector Machines (SVM)
Multivariate Decision Forests (MDF)

These models were chosen to compare their performance and determine the most suitable approach for weight/BMI prediction based on the given dataset.

Results

BMI prediction without augmentation

BMI prediction with augmentation

Weight prediction without augmentation

Weight prediction with augmentation

Installation

The following libraries are required to run this code:

pandas
matplotlib
numpy
scikit-learn (imported as sklearn)
scipy
seaborn
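
One way to check that these libraries are available before opening the notebooks is sketched below. It relies only on the standard library. The helper name missing_packages and the REQUIRED mapping are illustrative and not part of the project. The one point worth noting is that the sklearn module is installed under the pip name scikit-learn.

```python
from importlib.util import find_spec

# pip distribution name -> import name (they differ only for scikit-learn).
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing libraries; try: pip install " + " ".join(missing))
else:
    print("All required libraries are importable.")
```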

Usage

The data is first imported and merged into a single DataFrame.
The merged DataFrame is preprocessed by converting categorical variables into numerical, calculating the BMI from height and weight, and removing anomalies and duplicates.
The processed data is then split into a training set and a testing set.
Features are selected based on their correlation with the target variable.
The data is then passed to various machine learning models: Multi-Layer Perceptron (MLPRegressor), Linear Regression (LinearRegression), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RandomForestRegressor).
Hyperparameter tuning is performed on each model using GridSearchCV.
Each model's performance is evaluated based on Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (r2) metrics.
The performances of the models are also visualized using bar plots.
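
The steps above can be sketched end to end as follows. This is a minimal illustration under several assumptions, not the notebooks' actual code. The data is synthetic, the Smoker column is a hypothetical categorical indicator, BMI is computed as kg / m², the 0.1 correlation threshold is arbitrary, and only three of the five models are tuned with deliberately small grids to keep the run short.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the merged DataFrame (all values are made up).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Age": rng.integers(18, 60, n),
    "Height": rng.normal(1.70, 0.09, n),      # assumed to be in metres
    "Smoker": rng.choice(["yes", "no"], n),   # hypothetical categorical indicator
})
df["Weight"] = 70 * (df["Height"] - 0.7) + 0.2 * df["Age"] + rng.normal(0, 5, n)

# Preprocessing: encode categoricals, derive BMI, drop duplicate rows.
df["Smoker"] = df["Smoker"].map({"no": 0, "yes": 1})
df["BMI"] = df["Weight"] / df["Height"] ** 2  # assumes kg and metres
df = df.drop_duplicates()

# Train/test split for the weight task (use "BMI" as the target for the BMI task).
X = df.drop(columns=["Weight", "BMI"])
y = df["Weight"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Feature selection: keep features whose absolute correlation with the
# target, measured on the training set only, exceeds an arbitrary threshold.
corr = X_train.corrwith(y_train).abs()
selected = corr[corr > 0.1].index.tolist()

# Small illustrative grids; the real grids are not documented here.
models = {
    "LR": (LinearRegression(), {"fit_intercept": [True, False]}),
    "KNN": (KNeighborsRegressor(), {"n_neighbors": [3, 5, 9]}),
    "RF": (RandomForestRegressor(random_state=0),
           {"n_estimators": [50], "max_depth": [3, None]}),
}

results = {}
for name, (estimator, grid) in models.items():
    search = GridSearchCV(estimator, grid, cv=3,
                          scoring="neg_mean_squared_error")
    search.fit(X_train[selected], y_train)
    pred = search.predict(X_test[selected])
    mse = mean_squared_error(y_test, pred)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "R2": r2_score(y_test, pred),
    }

print(pd.DataFrame(results).T.round(3))
```

The resulting table has one row per model, so a bar plot of any metric can be drawn from it, for example with pd.DataFrame(results).T["RMSE"].plot.bar() when matplotlib is installed.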

Contributors

Irene Rigato
