Releases: evfro/polara
Hotfix Release 0.7.2
Release 0.7.0
A long-overdue release that subsumes several release steps (hence, it is not a direct successor of version 0.6.4). Below is a non-exhaustive list of the major changes and improvements in this release.
Models
- HybridSVD is finally in the master branch.
- New wrapper for BPR model from the `implicit` library.
Scenarios
- Item cold start support for all SVD-based models.
- Item cold start support for LCE model.
- Item cold start support for LightFM model.
- Computationally-efficient support for negative sampling in evaluation. Uses smart "on-the-fly" sampling and calculation of scores.
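The idea behind evaluation with negative sampling can be illustrated with a minimal sketch. It is not Polara's implementation: the function name, its arguments and the toy scorer are hypothetical, and the code only shows the general technique of ranking a held-out item against a handful of sampled unseen items, with scores computed only for those items.

```python
import random


def rank_against_negatives(score, holdout_item, seen_items, all_items,
                           n_negatives, rng):
    """Rank one held-out item against randomly sampled unseen items."""
    # Negatives are items the user has not interacted with.
    candidates = [i for i in all_items
                  if i not in seen_items and i != holdout_item]
    negatives = rng.sample(candidates, min(n_negatives, len(candidates)))
    holdout_score = score(holdout_item)
    # Scores are computed on the fly, only for the sampled items.
    return 1 + sum(score(i) > holdout_score for i in negatives)


# Toy scorer (hypothetical): item id divided by 10.
items = list(range(10))
toy_score = lambda item: item / 10
rng = random.Random(0)
best_rank = rank_against_negatives(toy_score, 9, {1, 2}, items, 5, rng)
worst_rank = rank_against_negatives(toy_score, 0, {1, 2}, items, 5, rng)
```

Because only the sampled items are scored, the cost per test user depends on the number of negatives rather than on the catalog size.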
Metrics
- MAP and ARHR are now a part of ranking evaluation output.
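For reference, here is a small sketch of how MAP and ARHR are commonly computed from top-n lists. Normalization conventions differ between libraries, so this is an illustration under stated assumptions (average precision normalized by the number of relevant items, reciprocal hit rank summed over all hits), not necessarily the exact formula used in Polara.

```python
def average_precision(recommended, relevant):
    """Average precision of one ranked list (normalized by len(relevant))."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def reciprocal_hit_rank(recommended, relevant):
    """Sum of 1/rank over all hits in one ranked list."""
    return sum(1.0 / rank
               for rank, item in enumerate(recommended, start=1)
               if item in relevant)


# Toy top-n lists and holdout sets for two users.
users = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y", "z"], {"z"}),
]
map_score = sum(average_precision(r, h) for r, h in users) / len(users)
arhr_score = sum(reciprocal_hit_rank(r, h) for r, h in users) / len(users)
```

With a single holdout item per user, the reciprocal hit rank reduces to the usual reciprocal rank of that item.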
Tutorials
- Comprehensive tutorial on tuning and evaluation with polara.
- Use case: comparing LightFM and HybridSVD in cold start.
Features
- It is now possible to pass a dictionary with the desired config directly to the data model constructor instead of setting each attribute individually after construction.
- Many new convenience functions for data preprocessing/sampling in both dataframe and sparse matrix formats. In particular, there's a new time-aware sampling that ensures no "recommendations from future" are generated due to sampling of last-consumed items.
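The "recommendations from future" problem can be sketched as follows. This is one possible reading of time-aware sampling, assuming a single global time cutoff; the function name and the event format are hypothetical and do not reflect Polara's actual preprocessing API.

```python
def time_aware_split(events, cutoff):
    """Hold out last-consumed items without leaking future events.

    events: iterable of (user, item, timestamp) tuples.
    Training keeps only events strictly before the cutoff; the holdout
    keeps each user's last event, provided it happened at or after the
    cutoff. No training event is therefore later than a holdout event.
    """
    train = [e for e in events if e[2] < cutoff]
    last_event = {}
    for user, item, ts in events:
        if user not in last_event or ts > last_event[user][2]:
            last_event[user] = (user, item, ts)
    holdout = sorted(e for e in last_event.values() if e[2] >= cutoff)
    return train, holdout


events = [
    ("u1", "a", 1), ("u1", "b", 5), ("u1", "c", 9),
    ("u2", "a", 2), ("u2", "d", 7),
    ("u3", "b", 3),
]
train, holdout = time_aware_split(events, cutoff=6)
```

A plain per-user "leave last out" split would keep the event `("u2", "d", 7)` in training while testing a prediction for some other user's earlier holdout item, which is the leak that a cutoff-based split avoids.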
Improvements
- Better handling of item cold start scenarios.
- Better handling of LCE and LightFM models.
- Better handling of scaled SVD for different scenarios.
This release also includes a number of fixes that improve stability and reliability in certain scenarios.
Release 0.6.4
This release introduces a massive update to the framework with a new internal design and additional functionality. With this release, the long-broken support for Python 2 is dropped; all future releases will target Python 3 only, starting from version 3.6.
New models and additional functionality
- New Kernelized Probabilistic MF model.
- Built-in support for scaled version of PureSVD (see Reproducing EIGENREC results tutorial for details).
- Simple hybrid model that uses feature-similarity scores aggregation.
- Baseline models for item cold start regime: popularity-based, random, similarity-aggregation model, PureSVD.
- New classes to support item post-filtering.
- Unified handling of side feature-based relations.
- Support for several learning-rate schedules in SGD: adagrad, adam, rmsprop, plus three of my own heuristic schedules: adanorm, gnprop and gnpropz.
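As a reminder of what an adaptive learning-rate schedule does, below is a minimal sketch of the standard AdaGrad update on a toy one-dimensional problem. It is a textbook illustration, not the code used in Polara's SGD routines, and it says nothing about the custom adanorm, gnprop and gnpropz schedules.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One in-place AdaGrad update with per-coordinate step sizes."""
    for k, g in enumerate(grads):
        accum[k] += g * g  # running sum of squared gradients
        params[k] -= lr * g / (math.sqrt(accum[k]) + eps)


# Toy problem: minimize f(x) = x^2, whose gradient is 2x.
x = [1.0]
accum = [0.0]
adagrad_step(x, [2 * x[0]], accum)
after_first_step = x[0]
for _ in range(50):
    adagrad_step(x, [2 * x[0]], accum)
```

The effective step shrinks as squared gradients accumulate, so frequently updated coordinates get smaller learning rates over time.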
Hyper-parameter tuning
- Generic `find_optimal_config` function to perform random grid search over a user-defined hyper-parameter space.
- New `find_optimal_svd_rank` routine to quickly and efficiently tune SVD.
- New `find_optimal_tucker_ranks` routine to quickly and efficiently tune tensor-based models.
- Users can now define which configurations to skip in random grid search.
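The general idea of random grid search with skipped configurations can be sketched in plain Python. The function below and all of its parameters are hypothetical; it does not reproduce the signature or behavior of `find_optimal_config` and only illustrates the technique.

```python
import itertools
import random


def random_grid_search(param_grid, evaluate, n_samples, skip=None, seed=0):
    """Evaluate a random subset of grid points and return the best one."""
    names = sorted(param_grid)
    grid = [dict(zip(names, values))
            for values in itertools.product(*(param_grid[n] for n in names))]
    if skip is not None:
        # Drop configurations the user asked to exclude.
        grid = [config for config in grid if not skip(config)]
    rng = random.Random(seed)
    sampled = rng.sample(grid, min(n_samples, len(grid)))
    scored = [(evaluate(config), config) for config in sampled]
    best_score, best_config = max(scored, key=lambda pair: pair[0])
    return best_config, best_score


# Toy objective (hypothetical) that peaks at rank=20, scale=0.5.
toy_grid = {"rank": [10, 20, 40], "scale": [0.0, 0.5, 1.0]}
toy_eval = lambda c: -(c["rank"] - 20) ** 2 - (c["scale"] - 0.5) ** 2
best_config, best_score = random_grid_search(
    toy_grid, toy_eval, n_samples=6, skip=lambda c: c["rank"] == 40)
```

Here the skip rule removes three of the nine grid points, and `n_samples=6` happens to cover all remaining ones; with a larger grid only a random subset would be evaluated.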
Evaluation
- New versatile `run_cv_experiment` routine to automate cross-validation experiments. Supports both the default and the user-defined evaluation protocols.
- More ways to evaluate against the specific set of metrics supported by Polara.
Performance improvements
- Efficient handling of indices in the LightFM model (reduces memory load by orders of magnitude compared to the native LightFM implementation).
- Rating prediction with tensor-based models is now more efficient.
- Computation of Tucker core in tensor-based models is now optional.
Other improvements
- Revived Turi Create (ex Graphlab Create) support with its factorization models, including Factorization Machines.
- Refactored evaluation code.
- Refactored and improved code for SGD-based matrix factorization. Now supports both naive and probabilistic implementations.
- Improved handling of sparse operations.
- Better handling of side features.
- Improved timing functionality.
- Internal naming is now more consistent.
- Support for Amazon and Epinions datasets.
- Allow unpacking the probe part of the Netflix dataset.
- Some other minor improvements and fixes.
HotFix Release
Fixes the setup.py file to add LightFM functionality.
Release 0.6.2
The release introduces some performance improvements, extends evaluation metrics and adds new functionality:
New features and convenience functions
- new metric named `coverage`;
- support for LightFM model;
- new `set_config` method, which makes it easier to set the desired configuration of a model during grid search experiments;
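Coverage is usually understood as the share of the item catalog that shows up in at least one recommendation list. The sketch below follows that common definition; the exact formula behind Polara's `coverage` metric may differ, and the function signature here is hypothetical.

```python
def catalog_coverage(recommendations, n_items):
    """Fraction of catalog items present in at least one top-n list."""
    unique_items = set()
    for top_n in recommendations:
        unique_items.update(top_n)
    return len(unique_items) / n_items


# Three users, top-3 lists, catalog of 10 items.
toy_recs = [[0, 1, 2], [1, 2, 3], [0, 2, 4]]
toy_coverage = catalog_coverage(toy_recs, n_items=10)
```

A purely popularity-based model recommends the same few items to everyone and therefore scores low on this metric, even if its accuracy is reasonable.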
Performance improvements
- tensor-times-matrix products in the Tucker Decomposition can now be computed in parallel;
Other improvements
- more convenient routine to build one-hot encoding matrix from feature data;
- additional tuning parameter for iALS model;
- more flexible handling of input arguments in plotting functions;
- improved MyMediaLite wrapper + removed some inconsistencies.
Release 0.6.1
This is mostly a bugfix release with several improvements and additions, including:
- the functionality to select data based on some feedback threshold value, controlled by the `feedback_threshold` attribute
- improved handling of custom testing scenarios
- new tutorial with example on how to reproduce the EigenRec paper results
- improved messaging on current status
Release 0.6.0
This release provides a number of new features as well as performance improvements:
New features and convenience functions
- `feedback` parameter can now be omitted in `RecommenderData` instances, which simplifies work with purely implicit positive-only data;
- separate routine to unfold tensor along a specified mode;
- new random grid search routine `random_grid` in `polara/evaluation/pipelines`;
- evaluation now allows for parallel execution on test data chunks; it helps to reduce evaluation time in certain cases;
Performance improvements
- tensor rounding is now a part of tensor model, allowing for efficient rank truncation (similarly to SVD) without the need to recompute the whole model;
- computing recommendation scores in the tensor model is now more efficient in terms of both memory and CPU load;
- better handling of iALS algorithm from the `implicit` library; in the standard scenario, the evaluation is now performed fully on the polara side instead of relying on the inefficient `recommend` function;
Other improvements
- `get_movielens_data` now allows loading tags and timestamp data;
- HR and MRR metrics can now be calculated independently of the number of holdout items;
- user defined memory usage limit is now a computed value, allowing for dynamic changes;
- many improvements on code readability and naming consistency;
- several bugfixes and a number of other improvements, mostly related to computational efficiency and general workflow control.
Release 0.5.3
The release contains usability improvements and minor issue fixes. The most notable additions are:
- Simplified import of the main models directly from the root of the package.
- New mixins for handling feature data in cold-start regime.
- Simple content-based model for item cold-start regime.
There are also a few code improvements and fixes.
Release 0.5.2
Changes
Mostly code style corrections and several bug fixes.
Release 0.5.1
Interoperability between Python 2 and Python 3 is added.