Parent run missing in MLFlow when using nested trials #153

Open
EvenAR opened this issue Dec 12, 2023 · 3 comments
Labels
feature Change that does not break compatibility, but affects the public interfaces.

Comments

EvenAR commented Dec 12, 2023

Expected behavior

Using MLflowCallback(mlflow_kwargs={"nested": True}), I expect each trial to be grouped under a parent run in the MLflow tracking UI.

Environment

  • Optuna version: 3.4.0
  • Python version: 3.10.13
  • OS: Ubuntu 20.04.6 LTS
  • MLflow version: 2.9.1

Error messages, stack traces, or logs

N/A

Steps to reproduce

import optuna
from optuna.integration import MLflowCallback

# Callback that logs each trial to the MLflow tracking server
mflow_callback = MLflowCallback(
    tracking_uri="http://mlflow-container:1992",
    mlflow_kwargs={"nested": True},
    create_experiment=True,
)

@mflow_callback.track_in_mlflow()
def objective(trial):
    # ....

study = optuna.create_study(study_name="My experiment", direction="minimize")
study.optimize(objective, n_trials=2, callbacks=[mflow_callback])

Additional context (optional)

Perhaps my setup is missing something? Or is there a different recommended way to group hyperparameter runs?

As you can see, each trial appears as an individual run:
image

Thanks!

@EvenAR EvenAR added the bug Something isn't working label Dec 12, 2023
EvenAR (Author) commented Dec 14, 2023

One workaround I found is to manually create a parent run and set mlflow.parentRunId before the trials start.

Is this the correct way to achieve nested runs? If so, it would be helpful if this were described in the docs: https://optuna.readthedocs.io/en/stable/reference/generated/optuna.integration.MLflowCallback.html

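import mlflow
import optuna

def get_or_create_experiment(name, artifact_location=None):
    # Reuse the experiment if it already exists, otherwise create it
    experiment = mlflow.get_experiment_by_name(name)
    if not experiment:
        experiment_id = mlflow.create_experiment(name, artifact_location)
        experiment = mlflow.get_experiment(experiment_id)
    return experiment

experiment = get_or_create_experiment("My experiment")

# The parent run stays active while the study runs, so trial runs started with nested=True can attach to it
with mlflow.start_run(run_name="Hyperparameter tuning", experiment_id=experiment.experiment_id) as parent_run:
    study = optuna.create_study(study_name=experiment.name, direction="minimize", load_if_exists=True)
    # Probably redundant: this only sets a Python attribute on the mlflow module, not the mlflow.parentRunId tag
    mlflow.parentRunId = parent_run.info.run_id
    study.optimize(objective, n_trials=2, callbacks=[mflow_callback])

    mlflow.log_metric("best_value", study.best_value)
    # log_metrics only accepts numeric values; mlflow.log_params may suit non-numeric parameters better
    mlflow.log_metrics(study.best_params)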

image

nzw0301 (Member) commented Apr 16, 2024

Thank you for creating this. I think your workaround makes sense because the current MLflow callback does not create a parent run.

nzw0301 (Member) commented Apr 16, 2024

If we implement this feature, do you expect the following behaviour?

nested=False

As in the current implementation, an MLflow experiment corresponds to an Optuna study, and an MLflow run corresponds to an Optuna trial.

nested=True

We need to change the relationship between the Optuna and MLflow entities. For example, an MLflow run would correspond to an Optuna study, and an MLflow nested run (let's say child run?) would correspond to an Optuna trial. So in this mode, users would need to create an MLflow experiment explicitly, or Optuna would need to create one from a given experiment name.

@nzw0301 nzw0301 added feature Change that does not break compatibility, but affects the public interfaces. and removed bug Something isn't working labels Apr 16, 2024
@nzw0301 nzw0301 transferred this issue from optuna/optuna Aug 19, 2024