Add a hill-climbing algorithm as an Optuna sampler #122

Open
toshihikoyanase opened this issue Jul 31, 2024 · 5 comments
Labels
contribution-welcome (Contribution welcome issues) · new-package (New packages)

Comments

@toshihikoyanase
Member

Motivation

Hill climbing is a local search algorithm commonly used for optimization problems. The sampler based on the hill-climbing algorithm aims to provide a simple yet effective way to explore the hyperparameter space and find good solutions.

Description

For simplicity, this hill-climbing sampler will only support the discrete case, where hyperparameters are represented as integers or categorical values.

The key components of the hill-climbing sampler will include:

  • Initialization: the sampler starts from a random point in the hyperparameter space, drawn with Optuna's RandomSampler.
  • Neighbor lists: the sampler maintains two lists of neighboring points, one for points still to be evaluated and another for points already evaluated.
  • Move: the sampler moves in the direction of steepest ascent, chosen from among the current point's neighbors.
  • Restart: the sampler restarts the search from a new random point when no neighboring point improves the objective value.
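The loop described by these components can be sketched in plain Python. This is a hypothetical, framework-free sketch of the algorithm, not the eventual Optuna sampler; `hill_climb`, `grids`, and `n_restarts` are illustrative names, not part of Optuna's API:

```python
import random


def hill_climb(objective, grids, n_restarts=5, seed=0):
    """Steepest-ascent hill climbing over a discrete grid, with restarts.

    `grids` maps each parameter name to a sorted list of allowed values.
    Maximizes `objective`; restarts from a random point once no
    single-axis neighbour improves on the current point.
    """
    rng = random.Random(seed)
    names = list(grids)

    def neighbors(point):
        # Offset one axis at a time: at most 2*d neighbours for d axes.
        for n in names:
            i = grids[n].index(point[n])
            for j in (i - 1, i + 1):
                if 0 <= j < len(grids[n]):
                    yield {**point, n: grids[n][j]}

    best_point, best_value = None, float("-inf")
    for _ in range(n_restarts):
        # Initialization: a random starting point (RandomSampler's role).
        current = {n: rng.choice(grids[n]) for n in names}
        current_value = objective(current)
        while True:
            # Move: evaluate all neighbours, take the steepest ascent.
            cands = [(objective(p), p) for p in neighbors(current)]
            if not cands:
                break
            value, point = max(cands, key=lambda c: c[0])
            if value <= current_value:
                break  # Restart trigger: no neighbour improves the value.
            current, current_value = point, value
        if current_value > best_value:
            best_point, best_value = dict(current), current_value
    return best_point, best_value
```

On a concave objective such as `-(x - 3)**2 - (y + 1)**2`, every restart climbs to the global optimum at `x=3, y=-1`.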

Alternatives (optional)

No response

Additional context (optional)

No response

@toshihikoyanase added the contribution-welcome and new-package labels Jul 31, 2024
@toshihikoyanase changed the title from [WIP] Add a hill-climbing algorithm as an Optuna sampler to Add a hill-climbing algorithm as an Optuna sampler Aug 9, 2024
@csking101
Contributor

Hi @toshihikoyanase , could I take up this issue?

@toshihikoyanase
Member Author

Hi @csking101 .
Thank you for your interest. Please go ahead!

@csking101
Contributor

csking101 commented Oct 26, 2024

Hey @toshihikoyanase , I wanted to share my approach and get your feedback:

  • I will implement a sampler derived from BaseSampler.
  • It will store the following variables: evaluated_points (list), remaining_points (list), and best_points (list).
  • The search space is a discretized version of each parameter's distribution, so a point's neighbours are all the points offset from current_point along each distribution (for d dimensions, 2*d neighbours). Is this correct for the relative sampling method? In other words, should we move along one axis at a time, or consider points along multiple axes at once?
  • The rest of the algorithm remains the same: I will pick the neighbour with the best objective value, make it my current_point, and repeat until there is no improvement.
  • When that happens, I will restart from another random point, unrelated to the previous trials (though I will keep the best points from earlier trials). Should I impose a maximum limit on the number of trials?

Is this the right path?

@toshihikoyanase
Member Author

@csking101 Thank you for sharing the design of the sampler.

I will implement a Sampler from the BaseSampler

Yes, you can use BaseSampler. Alternatively, you may use SimpleBaseSampler, which simplifies the implementation; this tutorial shows how to use it.

I will store the following variables in that - evaluated_points(list), remaining_points(list), best_points(list)

Sounds good. evaluated_points might not be necessary since Study holds all evaluated points as Study.trials.

My search space is a discretized version of space in the respective distribution, so I will take a point and its neighbours will be all the points offset from the current_point in all the distributions (so for d dimensions, there will be 2*d neighbours). Is this correct for the relative sampling method? So pretty much we move one direction at a time, or should it be considering multiple axes' points at a time?

Please move a single axis at a time because this issue targets straightforward hill climbing.
As you mentioned, this approach can involve evaluating a large number of neighbours, but it should remain manageable for simple problems.
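Under this single-axis rule, neighbour enumeration for the discrete case (integers and categoricals, per the issue description) might look like the following hypothetical helper. `discrete_neighbors` and the `space` encoding are illustrative assumptions, not Optuna's API:

```python
def discrete_neighbors(point, space):
    """Enumerate single-axis neighbours of `point`.

    `space` maps each parameter name to ("int", low, high, step) or
    ("categorical", choices). An integer axis yields at most two
    neighbours (one step down and one step up, clipped to the range);
    a categorical axis yields every other choice.
    """
    out = []
    for name, spec in space.items():
        if spec[0] == "int":
            _, low, high, step = spec
            for v in (point[name] - step, point[name] + step):
                if low <= v <= high:
                    out.append({**point, name: v})
        else:  # categorical: any other choice is a neighbour
            for choice in spec[1]:
                if choice != point[name]:
                    out.append({**point, name: choice})
    return out
```

For a point in the interior of an integer range, each integer axis contributes exactly two neighbours, so a space with d integer axes yields 2*d neighbours; at a range boundary, an axis contributes only one.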

The rest of the algorithm remains the same: I will pick the neighbour with the best objective value, make it my current_point, and repeat until there is no improvement.

Sounds reasonable.

When this happens, I will start a new trial from another random point, with no relation to the previous trials (however, I will store the previous best points from the trials).

Sounds good! Please restart from a randomly chosen point.

Should I give some maximum limit for the number of trials?

There's no need to set a limit within the sampler, as users can specify the maximum number of trials via Study.optimize.

@csking101
Contributor

Thank you for the reply, I will open a PR shortly!

@csking101 csking101 mentioned this issue Nov 1, 2024