fixing tests
pplonski committed Apr 9, 2019
1 parent ff1957f commit e959f3c
Showing 2 changed files with 8 additions and 8 deletions.
15 changes: 7 additions & 8 deletions README.md
@@ -10,17 +10,17 @@
## The new standard in Machine Learning!

Thanks to Automated Machine Learning, you don't need to worry about different machine learning interfaces or know every algorithm and its hyper-parameters. With AutoML, model tuning and training are painless.

In the current version, only binary classification is supported, with optimization of the LogLoss metric.

## Example

```python
import pandas as pd
from supervised.automl import AutoML

df = pd.read_csv("https://raw.githubusercontent.com/pplonski/datasets-for-start/master/adult/data.csv", skipinitialspace=True)
print(df.head())

X = df[df.columns[:-1]]
y = df["income"]

@@ -58,7 +58,7 @@ This is Automated Machine Learning package, so all hard tasks is done for you. T

#### Train and predict

```python
automl = AutoML()
automl.fit(X, y)
predictions = automl.predict(X)
@@ -76,15 +76,14 @@ By the default, the training should finish in less than 1 hour and as ML algorit
The parameters that you can use to control the training process are:

- **total_time_limit** - the total time, in seconds, that AutoML can spend searching for the best ML model. _Default is set to 3600 seconds._
- **learner_time_limit** - the time limit for training a single model. With `k`-fold cross-validation, the time spent on training is `k*learner_time_limit`. This parameter is only considered when `total_time_limit` is set to None. _Default is set to 120 seconds_.
- **algorithms** - the list of algorithms that will be checked. _Default is set to ["CatBoost", "Xgboost", "RF", "LightGBM", "NN"]_.
- **start_random_models** - the number of models to check with the _not so random_ algorithm. _Default is set to 10_.
- **hill_climbing_steps** - the number of hill climbing steps used in model tuning. _Default is set to 3_.
- **top_models_to_improve** - the number of models considered for improvement in each hill climbing step. _Default is set to 5_.
- **train_ensemble** - decides whether an ensemble model is trained at the end of the AutoML fit procedure. _Default is set to True_.
- **verbose** - controls printouts. _Default is set to True_.
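
As a rough sketch of how these options fit together, the snippet below collects the defaults listed above into a plain dictionary and computes the worst-case per-model training time under `k`-fold cross-validation. The names `DEFAULT_PARAMS` and `max_learner_seconds` are hypothetical helpers written for this illustration, not part of the package. Passing the options as keyword arguments to `AutoML(...)` is an assumption that this diff does not confirm.

```python
# Defaults copied from the parameter descriptions in the README.
DEFAULT_PARAMS = {
    "total_time_limit": 3600,      # seconds
    "learner_time_limit": 120,     # seconds, per single model fit
    "algorithms": ["CatBoost", "Xgboost", "RF", "LightGBM", "NN"],
    "start_random_models": 10,
    "hill_climbing_steps": 3,
    "top_models_to_improve": 5,
    "train_ensemble": True,
    "verbose": True,
}


def max_learner_seconds(k_folds, learner_time_limit):
    """Worst-case training time for one model: k * learner_time_limit."""
    return k_folds * learner_time_limit


# learner_time_limit is only considered when total_time_limit is None,
# so this override switches the total limit off and narrows the algorithms.
params = dict(DEFAULT_PARAMS, total_time_limit=None, algorithms=["Xgboost", "RF"])

print(max_learner_seconds(5, params["learner_time_limit"]))

# Assumption (hypothetical usage): the options are constructor keyword arguments.
#   automl = AutoML(**params)
```

Keeping the overrides in a dictionary makes it easy to log exactly which settings a given run used.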


## Development

### Installation
1 change: 1 addition & 0 deletions supervised/automl.py
@@ -191,6 +191,7 @@ def ensemble_step(self, y):
    def fit(self, X, y):
        start_time = time.time()
        X.reset_index(drop=True, inplace=True)
        y = np.array(y)
        if not isinstance(y, pd.DataFrame):
            y = pd.DataFrame(y)
        y.reset_index(drop=True, inplace=True)
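
To illustrate what the target handling in `fit` does, here is a minimal standalone sketch that mirrors those three lines. `normalize_target` is a hypothetical helper written for this example, not a function in `supervised/automl.py`. It shows that lists, arrays, and Series all come out as a single-column DataFrame with a fresh 0..n-1 index. Note that once `y` is a NumPy array, the `isinstance` check is always satisfied, so the DataFrame is always rebuilt.

```python
import numpy as np
import pandas as pd


def normalize_target(y):
    """Mirror the target handling shown in fit() (illustrative only)."""
    y = np.array(y)                      # drops any pandas index or labels
    if not isinstance(y, pd.DataFrame):  # always true after np.array(...)
        y = pd.DataFrame(y)
    y.reset_index(drop=True, inplace=True)
    return y


# A Series with a non-default index, e.g. after filtering a larger frame.
s = pd.Series([0, 1, 1], index=[10, 20, 30])
out = normalize_target(s)
print(type(out).__name__, out.shape, list(out.index))
```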
