We love scikit-learn, but we often find ourselves writing custom transformers, metrics and models. This project attempts to consolidate these into a single package that offers code quality and testing. It is a collaboration between multiple companies in the Netherlands. Note that we are not formally affiliated with the scikit-learn project.
Install scikit-lego via pip with:
pip install scikit-lego
Alternatively, to edit and contribute you can fork/clone and run:
$ pip install -e ".[dev]"
The documentation can be found here.
from sklego.transformers import RandomAdder
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
...
mod = Pipeline([
    ("scale", StandardScaler()),
    ("random_noise", RandomAdder()),
    ("model", LogisticRegression(solver='lbfgs'))
])
...
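The snippet above elides the data and the fitting step. As a rough, self-contained sketch of how such a pipeline might be fit end to end, the example below swaps `RandomAdder` for a hypothetical stand-in called `GaussianNoiseAdder`, so it runs with only scikit-learn and NumPy installed. The class name, its `noise` and `random_state` parameters, and the synthetic dataset are all assumptions made for illustration; they are not scikit-lego's API, and the stand-in's behaviour may differ from `RandomAdder`.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class GaussianNoiseAdder(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for a custom transformer (not part of scikit-lego)."""

    def __init__(self, noise=1.0, random_state=None):
        # scikit-learn convention: __init__ only stores its parameters.
        self.noise = noise
        self.random_state = random_state

    def fit(self, X, y=None):
        # Nothing to learn; remember the input width so transform can check it.
        self.n_features_in_ = np.asarray(X).shape[1]
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if X.shape[1] != self.n_features_in_:
            raise ValueError("X has a different number of columns than during fit")
        # This sketch adds noise on every call to transform, including at predict time.
        rng = np.random.default_rng(self.random_state)
        return X + rng.normal(0.0, self.noise, size=X.shape)


# Synthetic data, assumed purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

mod = Pipeline([
    ("scale", StandardScaler()),
    ("random_noise", GaussianNoiseAdder(noise=0.1, random_state=42)),
    ("model", LogisticRegression(solver="lbfgs")),
])

mod.fit(X, y)
predictions = mod.predict(X)
```

Because the stand-in follows the usual `fit`/`transform` contract and inherits from `BaseEstimator` and `TransformerMixin`, it slots into `Pipeline` like any built-in step, which is the same property the project's custom transformers rely on.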
We want to be rather open in what we accept, but we require three things before a feature is added to the project:
- any new feature contributes towards a demonstrable real-world use case
- any new feature passes standard unit tests (we have a few for transformers and predictors)
- any new feature has been discussed in the issue list beforehand