Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal Varianсe Sampling (MVS) booster #5091

Closed
wants to merge 26 commits into from
Closed

Minimal Varianсe Sampling (MVS) booster #5091

wants to merge 26 commits into from

Conversation

StrikerRUS
Copy link
Collaborator

@StrikerRUS StrikerRUS commented Mar 22, 2022

Original PR #4266.
Fixed #2644.

Opening this PR to not accidentally remove underlying branch or forget about it existence.

kruda and others added 2 commits March 13, 2022 00:14
* Added base for minimal variance sampling booster

* Implemented MVS booster with support for multioutput targets, deterministic execution on small datasets/

* Updated documentation and fixed some linting errors

* fixed python sklearn documentation, tryed to fix R Cran CI

* Second attempt to fix R pipeline

* Fixed R package build for windows and linting error

* Revert "Fixed R package build for windows and linting error"

This reverts commit d50769e

* Revert "Revert "Fixed R package build for windows and linting error""

This reverts commit f531f3a.

* Fixed some documentation

* Fixed intendation error in mvs.hpp, fixed some windows build issues, added spinx version upper bound

* Fixed intendation error in mvs.hpp, fixed some windows build issues, added spinx version upper bound

* Update requirements_base.txt

* Update R-package/src/Makevars.in

Co-authored-by: James Lamb <[email protected]>

* Update R-package/src/Makevars.win.in

Co-authored-by: James Lamb <[email protected]>

* Added MVS booster support for dask tests

* Moved CalculateThresholdSequential to array_args.h and renamed it to CalculateThresholdMVS

* Added cpp tests for ArrayArgs::CalculateThresholdMVS and ArrayArgs::Partition.

* Fix linter errors in test_dask.py

* Fixed UB in ArrayArgs::Partition, when it is called with one element.

* Fixed linter errors

* Added more cpp tests and fixed linting errors

* Fixed linting errors

* Updated R-package documentation
Updated documentation
Updated test_mvs_threshold_search.cpp
Added parallel computation of regularized absolute value term.
Added new mvs parameter from constant.

* Updated MVS Lambda algorithm

* Updated documentation, MVS::GetLambda, MVS::GetThreshold, updated MVS::ResetConfig

* [ci] fix current `master` fails with graphviz-related error (#5068)

* Update test_windows.ps1

* Update .appveyor.yml

* Update test_windows.ps1

* Update .appveyor.yml

Co-authored-by: James Lamb <[email protected]>
Co-authored-by: Nikita Titov <[email protected]>
Co-authored-by: Yu Shi <[email protected]>
@shiyu1994 shiyu1994 changed the title [WIP] Minimal Varianсe Sampling (MVS) booster Minimal Varianсe Sampling (MVS) booster Jan 5, 2023
@shiyu1994 shiyu1994 marked this pull request as ready for review January 5, 2023 16:30
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviving this @shiyu1994 ! I left a few minor comments for your consideration, but I'm not qualified to review this thoroughly. Hopefully @guolinke will be able to review.

Please @ me if you need help with the failing CI jobs.

include/LightGBM/config.h Outdated Show resolved Hide resolved
include/LightGBM/config.h Outdated Show resolved Hide resolved
tests/cpp_tests/test_mvs_threshold_search.cpp Outdated Show resolved Hide resolved
src/boosting/mvs.hpp Outdated Show resolved Hide resolved
src/boosting/mvs.cpp Outdated Show resolved Hide resolved
include/LightGBM/config.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@guolinke guolinke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, LGTM except a comment.

@shiyu1994
Copy link
Collaborator

This is still in progress, and I need to add some test cases for MVS before merging this.

Log::Info("Using GOSS");
balanced_bagging_ = false;
bag_data_indices_.resize(num_data_);
#ifdef USE_CUDA_EXP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised to see new commits referencing cuda_exp. cuda_exp was completely removed in #5677.

Please update this branch to latest master and replace any cuda_exp references with cuda.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reminder. Will merge with the master branch.

@jameslamb
Copy link
Collaborator

It's been over a year since the last commit to this PR, so I'm assuming that you're no longer working on this @shiyu1994. I'm closing this due to lack of activity, to make that clear to anyone else coming to the project interested in working on #2644.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Minimal Variance Sampling in Stochastic Gradient Boosting
5 participants