MLflow for Experiment Tracking

What is MLflow?

MLflow is an open-source platform designed to manage the end-to-end machine learning lifecycle. It simplifies and accelerates the process of developing and deploying machine learning models by providing a suite of tools for experiment tracking, model packaging, and model deployment.

Databricks Community Edition (CE) provides a free, hosted MLflow tracking server. Most MLflow functionality is supported; the notable exception is that serving endpoints cannot be created on CE, so model deployment is not available. A self-managed or local MLflow server can be set up instead; see the instructions.

Experiment

For this experiment, a time series call forecasting model is trained. XGBoost is tuned over several hyperparameter combinations using time series cross-validation from scikit-learn. All metrics are logged to the MLflow service in Databricks Community Edition (CE).
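As a rough illustration of time series cross-validation, the sketch below uses scikit-learn's TimeSeriesSplit on a synthetic 12-point series. The series, its length, and the number of splits are assumptions for illustration only; they are not taken from the project's notebook.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Synthetic stand-in for an ordered call-volume series (assumption: real data differs).
calls = np.arange(12)

# Three folds; each one trains on the past and validates on the block that follows.
tscv = TimeSeriesSplit(n_splits=3)
folds = [
    (train_idx.tolist(), test_idx.tolist())
    for train_idx, test_idx in tscv.split(calls)
]

for train_idx, test_idx in folds:
    print(f"train={train_idx} test={test_idx}")
```

Unlike shuffled k-fold, every validation block lies strictly after its training window, so the model is never scored on data older than what it was trained on. A splitter like this can be passed as the cv argument of scikit-learn search utilities such as GridSearchCV when tuning an estimator.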

Prerequisites & Setup

Set up MLflow in Databricks: follow the link and complete the sign-up process.
Log in to Databricks, which should bring you to this page. In the left pane, go to Machine Learning > Experiments, then click Create Experiment at the top right.
Create an experiment on the server called call_forecasting.

Run

Google Colab

Store your Databricks username as DatabricksUserName in Google Colab Secrets. Refer to this article on how to store the keys.
Execute the cell !databricks configure --host https://community.cloud.databricks.com/ and enter your Databricks username and password when prompted.
Run the remaining cells and observe the experiment results in the Experiment window in Databricks.
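The steps above might translate into notebook code along the following lines. This is a minimal sketch, not the notebook's actual code. The helper names experiment_path and connect_from_colab are hypothetical, and the /Users/<username>/ location is an assumption based on Databricks storing experiments under absolute workspace paths. Adjust it if the experiment was created elsewhere.

```python
def experiment_path(username: str, name: str = "call_forecasting") -> str:
    """Build the workspace path of the experiment (assumed to sit in the user's home folder)."""
    return f"/Users/{username}/{name}"


def connect_from_colab(secret_name: str = "DatabricksUserName") -> str:
    """Point MLflow at Databricks and select the experiment; returns the path used."""
    # Lazy imports: google.colab exists only inside a Colab runtime.
    from google.colab import userdata
    import mlflow

    username = userdata.get(secret_name)   # read the value stored in Colab Secrets
    mlflow.set_tracking_uri("databricks")  # uses credentials from `databricks configure`
    path = experiment_path(username)
    mlflow.set_experiment(path)
    return path


print(experiment_path("user@example.com"))
```

Calling connect_from_colab() after the databricks configure cell has run should make subsequent MLflow runs appear under the call_forecasting experiment.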

Expected Results

The experiment metrics are logged to MLflow as the model trains. The Run Name can be set in the Python script to better describe each run. For each model, the MSE, MAE, and MAPE values are logged and can be visualised in the Chart tab. The run named best_model is the model with the lowest MSE. All model artifacts are created, including dependency files such as conda.yaml and requirements.txt.
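To make the logged metrics and the best_model selection concrete, here is a small sketch under stated assumptions. The candidate names, hyperparameters, and forecast values are invented for illustration, and forecast_metrics and log_run are hypothetical helpers rather than functions from the project's script.

```python
import numpy as np


def forecast_metrics(y_true, y_pred) -> dict:
    """Return MSE, MAE, and MAPE for one set of forecasts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        "mse": float(np.mean(err ** 2)),
        "mae": float(np.mean(np.abs(err))),
        # MAPE as a fraction; assumes y_true contains no zeros.
        "mape": float(np.mean(np.abs(err / y_true))),
    }


def log_run(run_name: str, params: dict, metrics: dict) -> None:
    """Log one run's parameters and metrics to the active MLflow experiment."""
    import mlflow  # lazy import: only needed when actually logging

    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)


# Hypothetical candidates: hyperparameters paired with hold-out forecasts.
y_true = [100, 120, 80, 100]
candidates = {
    "max_depth_3": ({"max_depth": 3}, [110, 110, 90, 90]),
    "max_depth_6": ({"max_depth": 6}, [100, 125, 80, 95]),
}

results = {name: forecast_metrics(y_true, preds) for name, (_, preds) in candidates.items()}
best_name = min(results, key=lambda name: results[name]["mse"])  # lowest MSE wins
print(best_name, results[best_name])
```

With a tracking server configured, each candidate could be logged with log_run(name, params, results[name]), and the winner logged once more under the run name best_model.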

Databricks_Results.mp4
