- step_umap() has gained initial and target_weight arguments. (#213)
- Calling ?tidy.step_*() now sends you to the documentation for step_*() where the outcome is documented. (#216)
- Documentation for tidy methods for all steps has been improved to describe the return value more accurately. (#217)
- {keras} and {tensorflow} have been moved to Suggests instead of Imports. (#218)
- step_collapse_stringdist() will now return predictors as factors. (#204)
- Fixed regression from 1.1.2 in step_lencode_glm() where it couldn't be used on multiple columns.
- The keep_original_cols argument has been added to step_woe(). This change should mean that every step that produces new columns has the keep_original_cols argument. (#194)
- Many internal changes to improve consistency and slight speed increases.
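
As an illustration of the keep_original_cols argument on step_woe(), here is a minimal sketch. The toy data frame and its column names (y, grp) are made up for this example, and it assumes the recipes and embed packages are installed.

```r
library(recipes)
library(embed)

# Hypothetical toy data: a two-level outcome and one categorical predictor
set.seed(1)
dat <- data.frame(
  y = factor(sample(c("yes", "no"), 200, replace = TRUE)),
  grp = factor(sample(c("a", "b", "c", "d"), 200, replace = TRUE))
)

# Keep the original `grp` column next to its weight of evidence encoding
rec <- recipe(y ~ grp, data = dat) %>%
  step_woe(grp, outcome = vars(y), keep_original_cols = TRUE) %>%
  prep()

baked <- bake(rec, new_data = NULL)
names(baked)
```

With the default keep_original_cols = FALSE, only the encoded column would be returned for grp.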
- step_pca_sparse(), step_pca_truncated() and step_pca_sparse_bayes() now return data unaltered if num_comp = 0. This is done to be consistent with recipes steps of the same nature. (#190)
- Fixed bug where step_pca_truncated() didn't work with zero selection. (#181)
- The tidy() methods for step_discretize_cart(), step_discretize_xgb(), step_embed(), step_feature_hash(), step_lencode_bayes(), step_lencode_glm(), step_lencode_mixed(), step_pca_sparse(), step_pca_sparse_bayes(), step_pca_truncated(), step_umap(), and step_woe() now correctly return zero-row tibbles when used with empty selections. (#181)
- step_pca_truncated() has been added. This step only calculates the components that are required, and will be a speedup in cases where it is used on many variables. (#82)
- step_collapse_stringdist() has gained method and options arguments to allow for different types of string distance calculations. (#152)
- step_umap() has gained the argument metric. (#154)
- step_embed() has gained the keep_original_cols argument. (#176)
- All steps now have required_pkgs() methods.
- Steps with tunable arguments now have those arguments listed in the documentation.
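
The method and options arguments of step_collapse_stringdist() can be sketched as follows. The toy data and the chosen distance values are illustrative only, and the example assumes the recipes, embed, tibble, and stringdist packages are installed.

```r
library(recipes)
library(embed)

# Hypothetical toy data with a few near-identical strings
data0 <- tibble::tibble(
  x1 = c("a", "b", "d", "e", "sfgsfgsd", "hjhgfgjgr"),
  x2 = c("ak", "b", "djj", "e", "hjhgfgjgr", "hjhgfgjgr")
)

# Default method: pool levels whose string distance is at most 1
rec_default <- recipe(~., data = data0) %>%
  step_collapse_stringdist(all_predictors(), distance = 1) %>%
  prep()
out_default <- bake(rec_default, new_data = NULL)

# Jaro-Winkler distance, with extra options passed on to the distance calculation
rec_jw <- recipe(~., data = data0) %>%
  step_collapse_stringdist(
    all_predictors(),
    distance = 0.75, method = "jw", options = list(p = 0.1)
  ) %>%
  prep()
out_jw <- bake(rec_jw, new_data = NULL)

out_default
out_jw
```

The distance threshold has to be chosen on the scale of the selected method: edit-based distances are counts, while Jaro-Winkler distances lie between 0 and 1.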
- All steps that add new columns will now informatively error if name collision occurs.
- step_collapse_cart() can pool a predictor's factor levels using a tree-based method.
- step_collapse_stringdist() can pool a predictor's factor levels using string distances.
- Case weights support has been added to step_discretize_cart(), step_discretize_xgb(), step_lencode_bayes(), step_lencode_glm(), and step_lencode_mixed().
- step_embed() now correctly defaults to a random id containing the word "embed". (#102)
- step_feature_hash() is soft deprecated in embed in favor of step_dummy_hash() in textrecipes. (#95)
- Steps now have a dedicated subsection detailing what happens when tidy() is applied. (#105)
- Reorganize documentation for all recipe step tidy methods (#115).
- Fixed a bug where woe_table() and step_woe() didn't respect the factor levels of the outcome. (#109)
- Re-licensed package from GPL-2 to MIT. See consent from copyright holders here.
- The tunable parameter ranges for step_umap() were changed for neighbors, num_comp, and min_dist to prevent uwot segmentation faults. The step also checks to see if the data dimensions are consistent with the argument values.
- Two new PCA steps were added, each using sparse techniques for estimation: step_pca_sparse() and step_pca_sparse_bayes().
- Updated to use recipes_eval_select() from recipes 0.1.17 (#85).
- Added prefix argument to step_umap() to harmonize with other recipes steps (#93).
- All embed recipe steps now officially support empty selections to be more aligned with recipes, dplyr and other packages that use tidyselect.
- step_woe() no longer warns about high-cardinality predictors when the recipe is estimated. Instead it warns when categories have fewer than 10 data points in the training set. (#74)
- Minor release with changes to test for cases when CRAN cannot get xgboost to work on their Solaris configuration.
- lme4 and rstanarm are now in the Suggests list so they are not automatically installed with embed. A message is written to the console if those packages are missing and their associated step functions are invoked.
- More changes to enable better parallel processing on Windows.
- Changes to enable better parallel processing on Windows.
- Changes to tests to get out of archive jail.
- Updated the plumbing behind step_woe().
- Due to a bug in tensorflow, added a "warm start" to instigate a TF session if one does not currently exist.
- Changes for dplyr 1.0.0
- step_discretize_xgb() and step_discretize_cart() can be used to convert numeric predictors to categorical using supervised binning methods based on tree models. Thanks to Konrad Semsch for the contribution.
- Added step_feature_hash() for creating dummy variables using feature hashing.
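
A minimal sketch of supervised binning with step_discretize_cart() follows. The simulated data and the column names x and y are hypothetical, and the example assumes the recipes, embed, and rpart packages are installed.

```r
library(recipes)
library(embed)

# Hypothetical toy data: one numeric predictor that separates two classes
set.seed(2)
dat <- data.frame(
  x = c(rnorm(100, mean = 0), rnorm(100, mean = 5)),
  y = factor(rep(c("low", "high"), each = 100))
)

# Bin `x` at the split points found by a tree fit against the outcome
rec <- recipe(y ~ x, data = dat) %>%
  step_discretize_cart(x, outcome = "y") %>%
  prep()

binned <- bake(rec, new_data = NULL)
levels(binned$x)
```

The bin boundaries come from the fitted tree, so they depend on the training data rather than on fixed quantiles.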
- tidy.step_woe() now has column names consistent with other recipe steps.
- Fixed a bug in detecting the TF version.
- Small changes for base R's stringsAsFactors change.
- The example data are now in the modeldata package.
- Small TF updates to step_embed().
- Methods were added for a future generic called tunable(). This outlines which parameters in a step can/could be tuned.
- Small updates to work with different versions of tidyr.
- step_umap() was added for both supervised and unsupervised encodings.
- step_woe() creates weight of evidence encodings.
A mostly maintenance release to be compatible with version 0.1.3 of recipes.
- The package now depends on the generics package to get the broom tidy methods.
- Karim Lahrichi added the ability to use callbacks when fitting tensorflow models. PR
First CRAN version