In late 2017, Kaggle and Porto Seguro, one of Brazil’s largest auto and homeowner insurance companies, organized a competition in which Kagglers were challenged to build a model that predicts the probability that a driver will initiate an auto insurance claim in the next year. While Porto Seguro had used machine learning for the past 20 years, they were looking to Kaggle’s machine learning community to explore new, more powerful methods. A more accurate prediction would allow them to further tailor their prices and, hopefully, make auto insurance coverage accessible to more drivers.
In this competition I finished in 33rd place out of more than 5,000 teams. My solution was based on the following set of models (an out-of-fold training sketch follows the list):
- LightGBM
- XGBoost
- Regularized Greedy Forest
- Feed-forward neural networks
- Field-Aware Factorization Machines (see the [original kernel by Scirpus])
- Follow The Regularized Leader Proximal (FTRL-Proximal; see the [original post and code by Scirpus])
- Stochastic Gradient Descent
- Ridge Classifier
- Logistic Regression
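
Base models in a stacked ensemble are typically trained with cross-validation so that out-of-fold predictions are available for the stacker. The snippet below is a minimal sketch of that pattern for two of the models listed above (LightGBM and logistic regression); the file name, the column names (`id`, `target`) and all hyperparameters are illustrative assumptions, not the actual competition settings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
import lightgbm as lgb

# Assumed file and column names -- purely illustrative.
train = pd.read_csv("train.csv")
X = train.drop(["id", "target"], axis=1)
y = train["target"]

def oof_predictions(make_model, X, y, n_splits=5, seed=42):
    """Return out-of-fold probability predictions for one base model."""
    oof = np.zeros(len(X))
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for trn_idx, val_idx in folds.split(X, y):
        model = make_model()
        model.fit(X.iloc[trn_idx], y.iloc[trn_idx])
        oof[val_idx] = model.predict_proba(X.iloc[val_idx])[:, 1]
    return oof

# Two of the base models listed above, with placeholder hyperparameters.
oof_lgb = oof_predictions(
    lambda: lgb.LGBMClassifier(n_estimators=500, learning_rate=0.02), X, y
)
oof_logreg = oof_predictions(
    lambda: LogisticRegression(C=0.1, max_iter=1000), X, y
)
```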
Most of these models were trained on a subset of the dataset's features, selected with the feature_selector module of my py_ml_utils package.
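
The feature_selector API itself is not reproduced here; as a rough illustration of the idea, the sketch below runs a generic greedy forward selection driven by cross-validated ROC AUC, which is one common way of picking such a subset. The model, hyperparameters, and stopping rule are assumptions, not the package's actual behaviour.

```python
from sklearn.model_selection import cross_val_score
import lightgbm as lgb

def greedy_forward_selection(X, y, max_features=30, cv=5):
    """Greedily add the feature that most improves CV AUC (illustrative only)."""
    selected, best_score = [], 0.0
    candidates = list(X.columns)
    while candidates and len(selected) < max_features:
        scores = {}
        for feat in candidates:
            model = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05)
            scores[feat] = cross_val_score(
                model, X[selected + [feat]], y, cv=cv, scoring="roc_auc"
            ).mean()
        feat, score = max(scores.items(), key=lambda kv: kv[1])
        if score <= best_score:
            break  # no remaining candidate improves the score
        selected.append(feat)
        candidates.remove(feat)
        best_score = score
    return selected

# selected = greedy_forward_selection(X, y)  # X, y as in the previous sketch
```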
Stacking was done with the linear stacker available in my predictor_stacker repository (linked below).
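
The linear stacker's own interface is likewise not shown here; the sketch below only illustrates the general idea of linear stacking, fitting non-negative weights over the base models' out-of-fold predictions by maximizing ROC AUC with scipy. The SLSQP optimizer and the sum-to-one constraint are assumptions, not necessarily what predictor_stacker does.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import roc_auc_score

def fit_linear_stacker(oof_preds, y):
    """Fit non-negative weights (summing to 1) over base-model OOF predictions."""
    P = np.column_stack(oof_preds)  # shape (n_samples, n_models)

    def neg_auc(weights):
        return -roc_auc_score(y, P @ weights)

    n = P.shape[1]
    result = minimize(
        neg_auc,
        x0=np.full(n, 1.0 / n),  # start from equal weights
        bounds=[(0.0, 1.0)] * n,
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        method="SLSQP",
    )
    return result.x

# weights = fit_linear_stacker([oof_lgb, oof_logreg], y)
```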
To reproduce the solution, simply clone the repository and run `./build_solution.sh`. The solution depends on:
- Python 3.6
- Scikit-learn
- Pandas
- Keras with Theano backend
- LightGBM 2.0.7
- XGBoost 0.6
- LibFFM executables
- https://github.com/goldentom42/py_ml_utils
- https://github.com/goldentom42/predictor_stacker