
Add option to specify "catch" in OptunaSearchCV #162

Closed
muhlbach opened this issue on Sep 6, 2024 · 2 comments
Labels: bug (Something isn't working)

Comments

muhlbach (Contributor) commented on Sep 6, 2024

Expected behavior

I'm hitting this error: FloatingPointError: underflow encountered in _ndtri_exp_single (vectorized) when fitting an OptunaSearchCV instance with an XGBRegressor, where the regularization parameters are sampled too low (I think that's the issue). I would love to pass the catch argument through to the study, but this is currently not possible with the way the scikit-learn interface is designed:

        if self.study is None:
            seed = random_state.randint(0, np.iinfo("int32").max)
            sampler = samplers.TPESampler(seed=seed)

            self.study_ = study_module.create_study(direction="maximize", sampler=sampler)

        else:
            self.study_ = self.study

        objective = _Objective(
            self.estimator,
            self.param_distributions,
            X_res,
            y_res,
            cv,
            self.enable_pruning,
            self.error_score,
            fit_params_res,
            groups_res,
            self.max_iter,
            self.return_train_score,
            self.scorer_,
        )

        _logger.info(
            "Searching the best hyperparameters using {} "
            "samples...".format(_num_samples(self.sample_indices_))
        )

        self.study_.optimize(
            objective,
            n_jobs=self.n_jobs,
            n_trials=self.n_trials,
            timeout=self.timeout,
            callbacks=self.callbacks,
            # <-- could add catch=self.catch here
        )
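
For reference, a minimal sketch of what the requested user-facing API could look like, assuming OptunaSearchCV grew a catch constructor argument that is simply forwarded to the optimize() call above (the catch keyword on Study.optimize already exists; the constructor parameter is the hypothetical part):

    # Hypothetical sketch only -- OptunaSearchCV does not accept `catch` today.
    from optuna.distributions import FloatDistribution
    from optuna.integration import OptunaSearchCV
    from xgboost import XGBRegressor

    params = {
        "reg_alpha": FloatDistribution(low=1e-10, high=1, log=True),
        "reg_lambda": FloatDistribution(low=1e-10, high=1, log=True),
    }
    search = OptunaSearchCV(
        estimator=XGBRegressor(),
        param_distributions=params,
        n_trials=20,
        # Hypothetical argument: would be passed through as
        # self.study_.optimize(..., catch=self.catch) at the spot marked above.
        catch=(FloatingPointError,),
    )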

Environment

  • Optuna version: 3.6.1
  • Optuna Integration version: 4.0.0
  • Python version: 3.12.3
  • OS: Windows-10-10.0.19045-SP0

Error messages, stack traces, or logs

11:29:22 [W 2024-09-06 11:29:16,762] Trial 10 failed with parameters: {} because of the following error: FloatingPointError('underflow encountered in _ndtri_exp_single (vectorized)').

Steps to reproduce

I cannot share a full reproduction because the data is confidential, but the gist of it is this:

import optuna
from optuna.distributions import FloatDistribution
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

params = dict(reg_alpha=FloatDistribution(low=1e-10, high=1, log=True),
              reg_lambda=FloatDistribution(low=1e-10, high=1, log=True))
model = optuna.integration.OptunaSearchCV(estimator=XGBRegressor(), param_distributions=params)
X, y = make_regression(n_samples=100, n_features=10, noise=10000000)
model.fit(X, y)

The above code fails when used with other (confidential) data.
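
In the meantime, the same kind of search can be driven with a plain study, where catch is already accepted by study.optimize; since the parameter suggestion (and hence the failing sampler call) happens inside the objective, the error should be catchable there. A rough workaround sketch along those lines, using cross_val_score in place of OptunaSearchCV's internal cross-validation:

    # Workaround sketch: run the search with optuna directly so that
    # study.optimize(..., catch=...) can skip trials that raise FloatingPointError.
    import optuna
    from sklearn.datasets import make_regression
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBRegressor

    X, y = make_regression(n_samples=100, n_features=10, noise=10000000)

    def objective(trial):
        model = XGBRegressor(
            reg_alpha=trial.suggest_float("reg_alpha", 1e-10, 1, log=True),
            reg_lambda=trial.suggest_float("reg_lambda", 1e-10, 1, log=True),
        )
        return cross_val_score(model, X, y, cv=5).mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=10, catch=(FloatingPointError,))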

Additional context (optional)

No response

muhlbach added the bug label on Sep 6, 2024
muhlbach (Contributor, Author) commented on Sep 9, 2024

I have discovered that it is not the estimator that fails when calling .fit(); it is the sampler. I see two different failure patterns:

Pattern 1:

Traceback (most recent call last):
  File "E:\conda\envs\quant\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 214, in __call__
    params = self._get_params(trial)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 325, in _get_params
    name: trial._suggest(name, distribution)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\trial\_trial.py", line 629, in _suggest
    param_value = self.study.sampler.sample_independent(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 447, in sample_independent
    return self._sample(study, trial, {param_name: param_distribution})[param_name]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 487, in _sample
    samples_below = mpe_below.sample(self._rng.rng, self._n_ei_candidates)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\parzen_estimator.py", line 81, in sample
    sampled = self._mixture_distribution.sample(rng, size)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\probability_distributions.py", line 65, in sample
    samples = _truncnorm.rvs(
              ^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 215, in rvs
    return ppf(percentiles, a, b) * scale + loc
           ^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 194, in ppf
    out[case_left] = ppf_left(q_left, a[case_left], b[case_left])
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 182, in ppf_left
    return _ndtri_exp(log_Phi_x)
           ^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\_truncnorm.py", line 170, in _ndtri_exp
    return np.frompyfunc(_ndtri_exp_single, 1, 1)(y).astype(float)

Pattern 2:

Traceback (most recent call last):
  File "E:\conda\envs\quant\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 214, in __call__
    params = self._get_params(trial)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna_integration\sklearn\sklearn.py", line 325, in _get_params
    name: trial._suggest(name, distribution)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\trial\_trial.py", line 629, in _suggest
    param_value = self.study.sampler.sample_independent(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 447, in sample_independent
    return self._sample(study, trial, {param_name: param_distribution})[param_name]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 488, in _sample
    acq_func_vals = self._compute_acquisition_func(samples_below, mpe_below, mpe_above)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\sampler.py", line 528, in _compute_acquisition_func
    log_likelihoods_above = mpe_above.log_pdf(samples)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\parzen_estimator.py", line 86, in log_pdf
    return self._mixture_distribution.log_pdf(transformed_samples)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\conda\envs\quant\Lib\site-packages\optuna\samplers\_tpe\probability_distributions.py", line 121, in log_pdf
    return np.log(np.exp(weighted_log_pdf - max_[:, None]).sum(axis=1)) + max_
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FloatingPointError: underflow encountered in exp
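
As a side note on the mechanism: numpy ignores floating-point underflow by default, so a FloatingPointError like the ones above usually only surfaces when something in the process has switched underflow handling to "raise" (e.g. via np.seterr). If that is acceptable in your setup, relaxing the error state around the fit is a possible stopgap, independent of the requested catch option (a sketch; model is the OptunaSearchCV instance from the reproduction above):

    import numpy as np

    # Inspect the current floating-point error handling; under="raise" is what
    # turns a harmless underflow in the TPE sampler into FloatingPointError.
    print(np.geterr())

    # Stopgap: ignore underflow only for the duration of the search.
    with np.errstate(under="ignore"):
        model.fit(X, y)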

nzw0301 (Member) commented on Nov 2, 2024

The issue is resolved by #163.

nzw0301 closed this as completed on Nov 2, 2024