ML-CPT (MACHINE LEARNING-CLEAN PROCESS TRAIN)

Get fast inference on your data, with a range of tools/algorithms at your disposal.

Visualization included. Code snippets to see what's running under the hood, too.

[click here: https://mlcpt.herokuapp.com]

SUPPORTS ONLY CLASSIFICATION TASKS (for now..)

How to Use

Use the EXPLORE section.

The app expects two separate files (train and test), so be sure to provide both.
The rest is handled automatically, but you can fine-tune each step as you like.
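
Before uploading, it can help to confirm that the train and test files line up. The following is only a minimal sketch using the Python standard library, not code taken from the app: the helper names `read_columns` and `compare_columns`, the column names, and the in-memory sample data are all made up for illustration.

```python
import csv
import io


def read_columns(handle):
    """Return the header row of a CSV file-like object."""
    return next(csv.reader(handle))


def compare_columns(train_cols, test_cols):
    """Report columns that appear in only one of the two files."""
    only_train = [c for c in train_cols if c not in test_cols]
    only_test = [c for c in test_cols if c not in train_cols]
    return only_train, only_test


# Tiny in-memory stand-ins for a train and a test file (hypothetical data).
train_file = io.StringIO("ID,age,gender,target\nc1,34,Male,1\nc2,51,FEMALE,0\n")
test_file = io.StringIO("ID,age,gender\nc3,29,male\n")

only_train, only_test = compare_columns(
    read_columns(train_file), read_columns(test_file)
)
print("only in train:", only_train)
print("only in test:", only_test)
```

In a typical setup the only column missing from the test file is the target; anything else listed here is worth checking before you upload.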

Features

DateTime feature engineering (datetime columns are automatically selected for you)
Duplicated rows will be dropped
Columns of type object are converted to lower case to help with cleaning and duplicate detection (e.g. Male and MALE ==> male)
ID field selection for the final test file, i.e.
| ID* | target |
| ------:| -----------:|
| customerID001 | 1 |
| customerID00109 | 0 |
| . | . |
| . | . |
Target column is detected automatically (looks for "target", "claim", "prediction", "response", as shown above; you can also pick the target column yourself)
Retain missing data up to a threshold of your choice (default is 50%), and choose how missing values are treated (default is mode)
Visualization
Selecting features to drop
Monotonic and/or unique data dropping (the ID field is stored before this step)
GetDummies on categorical features (drop_first is True)
Choose scaler (Standardization or Normalization)
Download your processed dataset (train, test)
Auto dataset split (60% - Train, 30% - Validation and 10% - Test)
(NEW*) SMOTE, RANDOM OVER/UNDER SAMPLER FOR IMBALANCED DATASET!
Algorithm selection (CatBoost, KNN, RandomForest and XGBoost)
Detailed report on prediction
Save the test predictions to obtain a baseline model score for your hackathon
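
A few of the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the app's actual code: the function names (`guess_target`, `clean_rows`, `split_60_30_10`) are hypothetical, target detection is assumed to be an exact, case-insensitive name match, and lower-casing is assumed to run before duplicate removal so that rows differing only in case collapse into one. The app itself may well rely on a dataframe library for these steps; the standard library is used here only to keep the sketch dependency-free.

```python
import random

# Names the README says the app looks for when picking the target column.
TARGET_NAMES = ("target", "claim", "prediction", "response")


def guess_target(columns):
    """Return the first column whose lower-cased name is a known target name.

    Assumption: exact, case-insensitive match. Returns None if nothing matches,
    in which case the user would pick the target manually.
    """
    for col in columns:
        if col.strip().lower() in TARGET_NAMES:
            return col
    return None


def clean_rows(rows):
    """Lower-case string values, then drop exact duplicate rows (first one kept)."""
    seen, cleaned = set(), []
    for row in rows:
        norm = tuple(v.lower() if isinstance(v, str) else v for v in row)
        if norm not in seen:
            seen.add(norm)
            cleaned.append(norm)
    return cleaned


def split_60_30_10(rows, seed=0):
    """Shuffle rows and split them 60% train, 30% validation, 10% test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * 6 // 10
    n_val = len(rows) * 3 // 10
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test


# Hypothetical sample data.
print(guess_target(["ID", "age", "Claim"]))
print(clean_rows([("Male", 1), ("MALE", 1), ("female", 0)]))
train, val, test = split_60_30_10(range(10))
print(len(train), len(val), len(test))
```

Integer arithmetic is used for the split sizes so that rounding never drops or duplicates a row; whatever is left after the train and validation slices goes to the test set.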

CONTRIBUTION

If your dataset fails to parse correctly, that is a sign for you to contribute to the project. Be sure to check out a new branch for any feature or fix you add. :)

To contribute: here

PERFORMANCE

Did you get a score with our test prediction? Kindly include it below (as a screenshot or in writing).

UmojaHack Nigeria: AXA Vehicle Insurance Claim Challenge by UmojaHack Africa (on ZINDI) - ~37.6% (72nd ranking) [base selections]

UNSUPPORTED DATASET

If you have a classification dataset that fails in this app, kindly include it here as a PR.

iameo/ml-cpt

Folders and files

Latest commit

History

Repository files navigation

ML-CPT (MACHINE LEARNING-CLEAN PROCESS TRAIN)

How to Use

Features

CONTRIBUTION

PERFORMANCE

UNSUPPORTED DATASET

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages