This notebook is a solution to a Kaggle competition problem. It reads the provided training and test datasets and fits a gradient boosting model to the training data. It then reports statistics that estimate how the model will perform, predicts on the test dataset, and writes the predictions to submission.csv. The process includes handling missing data, preprocessing, feature selection, and feature analysis.
An 8-page report, attached as a PDF, discusses our methodology and results.
Everything needed to run the solution yourself is in the Jupyter notebook, provided you have pandas, numpy, and sklearn (scikit-learn) installed.
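For readers who want a feel for the workflow before opening the notebook, here is a minimal sketch of the same kind of pipeline. It is not the notebook's actual code. It assumes a classification task, uses synthetic data in place of the competition files, and the column names (`f0` to `f3`, `id`, `target`), the median imputation, and the 5-fold cross-validation are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the competition data (hypothetical columns).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=["f0", "f1", "f2", "f3"])
y = (X["f0"] + X["f1"] > 0).astype(int)   # label computed before values go missing
X = X.mask(rng.random(X.shape) < 0.05)    # inject roughly 5% missing values

# Split into a labeled "training" part and an unlabeled "test" part.
train_X, train_y = X.iloc[:150], y.iloc[:150]
test_X = X.iloc[150:]

# Fill missing values with the column median, then fit gradient boosting.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    GradientBoostingClassifier(random_state=0),
)

# Estimate performance with 5-fold cross-validation on the training data.
scores = cross_val_score(model, train_X, train_y, cv=5)
print(f"Mean CV accuracy: {scores.mean():.3f}")

# Fit on all training data and inspect per-feature importances.
model.fit(train_X, train_y)
importances = pd.Series(model[-1].feature_importances_, index=train_X.columns)
print(importances.sort_values(ascending=False))

# Predict the test rows and write the submission file.
submission = pd.DataFrame({"id": test_X.index, "target": model.predict(test_X)})
submission.to_csv("submission.csv", index=False)
```

Wrapping the imputer and the model in one pipeline means the imputation statistics are recomputed inside each cross-validation fold, which avoids leaking information from the held-out fold into the fit.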