
Add option to specify "catch" in OptunaSearchCV #162

Closed
muhlbach opened this issue on Sep 6, 2024 · 2 comments
Labels: bug (Something isn't working)

Comments

muhlbach (Contributor) commented on Sep 6, 2024

Expected behavior

I'm hitting this error: FloatingPointError: underflow encountered in _ndtri_exp_single (vectorized) when fitting an OptunaSearchCV instance with an XGBRegressor, where the regularization parameters are sampled too low (I think that's the issue). I would love to pass the catch argument through to the study, but this is currently not possible with the way the scikit-learn interface is designed:

        if self.study is None:
            seed = random_state.randint(0, np.iinfo("int32").max)
            sampler = samplers.TPESampler(seed=seed)

            self.study_ = study_module.create_study(direction="maximize", sampler=sampler)

        else:
            self.study_ = self.study

        objective = _Objective(
            self.estimator,
            self.param_distributions,
            X_res,
            y_res,
            cv,
            self.enable_pruning,
            self.error_score,
            fit_params_res,
            groups_res,
            self.max_iter,
            self.return_train_score,
            self.scorer_,
        )

        _logger.info(
            "Searching the best hyperparameters using {} "
            "samples...".format(_num_samples(self.sample_indices_))
        )

        self.study_.optimize(
            objective,
            n_jobs=self.n_jobs,
            n_trials=self.n_trials,
            timeout=self.timeout,
            callbacks=self.callbacks,
            # <-- could add catch=self.catch here
        )
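
For reference, a minimal sketch of what the requested user-facing API could look like, assuming OptunaSearchCV grew a catch constructor argument that is simply forwarded to the optimize() call above (the catch keyword on Study.optimize already exists; the constructor parameter is the hypothetical part):

    # Hypothetical sketch only -- OptunaSearchCV does not accept `catch` today.
    from optuna.distributions import FloatDistribution
    from optuna.integration import OptunaSearchCV
    from xgboost import XGBRegressor

    params = {
        "reg_alpha": FloatDistribution(low=1e-10, high=1, log=True),
        "reg_lambda": FloatDistribution(low=1e-10, high=1, log=True),
    }
    search = OptunaSearchCV(
        estimator=XGBRegressor(),
        param_distributions=params,
        n_trials=20,
        # Hypothetical argument: would be passed through as
        # self.study_.optimize(..., catch=self.catch) at the spot marked above.
        catch=(FloatingPointError,),
    )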

Environment

  • Optuna version: 3.6.1
  • Optuna Integration version: 4.0.0
  • Python version: 3.12.3
  • OS: Windows-10-10.0.19045-SP0

Error messages, stack traces, or logs

11:29:22 [W 2024-09-06 11:29:16,762] Trial 10 failed with parameters: {} because of the following error: FloatingPointError('underflow encountered in _ndtri_exp_single (vectorized)').

Steps to reproduce

I cannot share a full reproduction because the data is confidential, but the gist of it is this:

import optuna
from optuna.distributions import FloatDistribution
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

params = dict(reg_alpha=FloatDistribution(low=1e-10, high=1, log=True),
              reg_lambda=FloatDistribution(low=1e-10, high=1, log=True))
model = optuna.integration.OptunaSearchCV(estimator=XGBRegressor(), param_distributions=params)
X, y = make_regression(n_samples=100, n_features=10, noise=10000000)
model.fit(X, y)

The above code fails when used with other (confidential) data.
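
In the meantime, the same kind of search can be driven with a plain study, where catch is already accepted by study.optimize; since the parameter suggestion (and hence the failing sampler call) happens inside the objective, the error should be catchable there. A rough workaround sketch along those lines, using cross_val_score in place of OptunaSearchCV's internal cross-validation:

    # Workaround sketch: run the search with optuna directly so that
    # study.optimize(..., catch=...) can skip trials that raise FloatingPointError.
    import optuna
    from sklearn.datasets import make_regression
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBRegressor

    X, y = make_regression(n_samples=100, n_features=10, noise=10000000)

    def objective(trial):
        model = XGBRegressor(
            reg_alpha=trial.suggest_float("reg_alpha", 1e-10, 1, log=True),
            reg_lambda=trial.suggest_float("reg_lambda", 1e-10, 1, log=True),
        )
        return cross_val_score(model, X, y, cv=5).mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=10, catch=(FloatingPointError,))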

Additional context (optional)

No response

muhlbach added the bug label on Sep 6, 2024
muhlbach (Contributor, Author) commented on Sep 9, 2024

I have discovered that it is not the estimator that fails when calling .fit(); it is the sampler. I see two different failure patterns:

Pattern 1:

Traceback (most recent call last):
  File "E:\conda\envs\quant\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 214, in __call__
    params = self._get_params(trial)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 325, in _get_params
    name: trial._suggest(name, distribution)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\trial\_trial.py", line 629, in _suggest
    param_value = self.study.sampler.sample_independent(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 447, in sample_independent
    return self._sample(study, trial, {param_name: param_distribution})[param_name]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 487, in _sample
    samples_below = mpe_below.sample(self._rng.rng, self._n_ei_candidates)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\parzen_estimator.py", line 81, in sample
    sampled = self._mixture_distribution.sample(rng, size)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\probability_distributions.py", line 65, in sample
    samples = _truncnorm.rvs(
              ^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 215, in rvs
    return ppf(percentiles, a, b) * scale + loc
           ^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 194, in ppf
    out[case_left] = ppf_left(q_left, a[case_left], b[case_left])
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 182, in ppf_left
    return _ndtri_exp(log_Phi_x)
           ^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 170, in _ndtri_exp
    return np.frompyfunc(_ndtri_exp_single, 1, 1)(y).astype(float)

Pattern 2:

Traceback (most recent call last):
  File "E:\conda\envs\quant\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 214, in __call__
    params = self._get_params(trial)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 325, in _get_params
    name: trial._suggest(name, distribution)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\trial\_trial.py", line 629, in _suggest
    param_value = self.study.sampler.sample_independent(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 447, in sample_independent
    return self._sample(study, trial, {param_name: param_distribution})[param_name]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 488, in _sample
    acq_func_vals = self._compute_acquisition_func(samples_below, mpe_below, mpe_above)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 528, in _compute_acquisition_func
    log_likelihoods_above = mpe_above.log_pdf(samples)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\parzen_estimator.py", line 86, in log_pdf
    return self._mixture_distribution.log_pdf(transformed_samples)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\probability_distributions.py", line 121, in log_pdf
    return np.log(np.exp(weighted_log_pdf - max_[:, None]).sum(axis=1)) + max_
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FloatingPointError: underflow encountered in exp
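
As a side note on the mechanism: numpy ignores floating-point underflow by default, so a FloatingPointError like the ones above usually only surfaces when something in the process has switched underflow handling to "raise" (e.g. via np.seterr). If that is acceptable in your setup, relaxing the error state around the fit is a possible stopgap, independent of the requested catch option (a sketch; model is the OptunaSearchCV instance from the reproduction above):

    import numpy as np

    # Inspect the current floating-point error handling; under="raise" is what
    # turns a harmless underflow in the TPE sampler into FloatingPointError.
    print(np.geterr())

    # Stopgap: ignore underflow only for the duration of the search.
    with np.errstate(under="ignore"):
        model.fit(X, y)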

nzw0301 (Member) commented on Nov 2, 2024

The issue is resolved by #163.

nzw0301 closed this as completed on Nov 2, 2024