Tutorials and personal practice using MLflow with actual cloud storage (MinIO/S3) and a Postgres DB. I also throw in some Hydra for more tool learning, but this could easily be recreated with simple argparse.
For a longer explanation, I made a blog post about it here.
- Create the conda environment or otherwise install the dependencies in environment.yml:

  ```bash
  conda env create -f environment.yml
  ```
- Install Docker and Docker Compose: https://docs.docker.com/compose/install/
First, spin up the MinIO and Postgres services:

```bash
docker-compose --profile backend up -d

# using make
make tracking-storage
```
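If you want to confirm the backend services came up before starting the tracking server, a quick check like the sketch below works. It is not part of the repo; it assumes the default ports, the user/password/mlflow Postgres credentials used in the server command below, and that requests and psycopg2 are installed.

```python
# Optional sanity check (not part of this repo) that MinIO and Postgres are up.
import psycopg2
import requests

# MinIO exposes a liveness endpoint on its API port.
resp = requests.get("http://localhost:9000/minio/health/live", timeout=5)
print("MinIO:", "ok" if resp.ok else f"unhealthy ({resp.status_code})")

# Postgres: connect with the same credentials the tracking server will use.
conn = psycopg2.connect(
    host="localhost", port=5432, user="user", password="password", dbname="mlflow"
)
print("Postgres server version:", conn.server_version)
conn.close()
```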
Start the tracking server:
```bash
mlflow server \
    --backend-store-uri postgresql://user:password@localhost:5432/mlflow \
    --artifacts-destination s3://mlruns \
    --host 0.0.0.0 \
    --port 5000

# using make
make mlflow-server
```
Or you can start both using make:

```bash
make mlflow-plus-storage
```
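Once the server is up, any Python process can log to it through the standard MLflow client API. A minimal sketch, independent of src/basic-example.py (the experiment and run names here are made up for illustration):

```python
# Minimal MLflow logging sketch; assumes the tracking server above is running
# on localhost:5000. Experiment/run names are illustrative, not from the repo.
import os

import mlflow

mlflow.set_tracking_uri(os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5000"))
mlflow.set_experiment("smoke-test")

with mlflow.start_run(run_name="sanity-check"):
    mlflow.log_param("model", "random_forest")
    mlflow.log_metric("accuracy", 0.93)
    # Artifacts are proxied by the server into the s3://mlruns bucket on MinIO.
    mlflow.log_dict({"note": "hello from the smoke test"}, "notes.json")
```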
See docker-compose.yaml and Makefile for the environment variables that control the services and their defaults. You can override them by setting any of the environment variables, preferably through some secrets file, but here it is in bash:
```bash
export POSTGRES_USER=user2
export POSTGRES_PASSWORD=password2
export AWS_ACCESS_KEY_ID=minioadmin      # minio username
export AWS_SECRET_ACCESS_KEY=minioadmin  # minio password

make mlflow-plus-storage
```
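As a hedged aside, the same AWS_* variables are what any S3 client uses to talk to MinIO, so you can verify them directly with boto3. The endpoint URL and the mlruns bucket name below are assumptions about the compose defaults:

```python
# Sketch (not part of the repo): list MinIO buckets using the same credentials
# the services consume.
import os

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ.get("MLFLOW_S3_ENDPOINT_URL", "http://localhost:9000"),
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)
# Expect to see an "mlruns" bucket once artifacts have been logged.
print([bucket["Name"] for bucket in s3.list_buckets()["Buckets"]])
```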
Finally, to run the example experiment, first set the MLFLOW_TRACKING_URI environment variable to the address of the MLflow server:
```bash
export MLFLOW_TRACKING_URI=http://localhost:5000  # where the mlflow server is running
```
Then run the script:
```bash
# let the script see the mlflow_practice directory, where I define some classes for hydra.
export PYTHONPATH=.

python src/basic-example.py
```
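To confirm the run actually landed in the tracking server (rather than in a local mlruns/ directory), you can query it back. A small sketch, assuming the server address above:

```python
# Sketch: list experiments and a few of their runs from the tracking server.
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="http://localhost:5000")
for exp in client.search_experiments():
    runs = client.search_runs(experiment_ids=[exp.experiment_id], max_results=3)
    print(exp.name, [run.info.status for run in runs])
```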
To run with remote storage, first spin up a Postgres DB and a MinIO/S3 service. For example, you can run the same docker-compose file on an AWS EC2 instance (and then set appropriate security rules, etc.).
Then, simply change the MLFLOW_S3_ENDPOINT_URL environment variable and the --backend-store-uri flag to the appropriate values, e.g.:
```bash
export MLFLOW_S3_ENDPOINT_URL=http://<ec2-instance-address-running-minio>:9000
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...

mlflow server \
    --backend-store-uri postgresql://user:password@<ec2-instance-address-running-postgres>:5432/mlflow \
    --artifacts-destination s3://mlruns \
    --host 0.0.0.0 \
    --port 5000

export MLFLOW_TRACKING_URI=http://localhost:5000
python src/basic-example.py
```
Alternatively, you can run the full docker-compose setup on the EC2 instance and then just set MLFLOW_TRACKING_URI=http://<ec2-instance-address>:5000 before running the script.
```bash
# in ec2, just set whatever environment variables you want and then run
docker compose up
```
and then locally:
```bash
export MLFLOW_TRACKING_URI=http://<ec2-instance-address>:5000
python src/basic-example.py
```
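Before launching a real run against the EC2-hosted server, it can be worth checking that the address and security-group rules are right. The tracking server serves a /health endpoint you can hit; a sketch, assuming requests is installed:

```python
# Reachability check for the remote tracking server (sketch, not in the repo).
import os

import requests

tracking_uri = os.environ["MLFLOW_TRACKING_URI"]  # http://<ec2-instance-address>:5000
resp = requests.get(f"{tracking_uri}/health", timeout=5)
print(resp.status_code, resp.text)  # expect 200 and "OK"
```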
The parameters of the training script are controlled by Hydra. It is a way to create structured configs, where, for instance, you have models, dataloaders, etc. that can all be configured in a smorgasbord of ways (a minimal config sketch follows the examples below). You can, for example, change from the default of training a random forest to training an SVM:
```bash
# let the script see the mlflow_practice directory, where I define some classes for hydra.
export PYTHONPATH=.

# change model
python src/basic-example.py model=svm

# with a different hyperparameter
python src/basic-example.py model=svm model.C=0.1

# or change the experiment name
python src/basic-example.py experiment_name=my_experiment

# start multiple runs with different hyperparameters (comma-separated sweeps need --multirun)
python src/basic-example.py --multirun model=svm model.C=0.1,1,10
```
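For reference, here is a minimal sketch of how Hydra structured configs could back these overrides. This is not the repo's actual mlflow_practice classes: the config group name (model) and its options (random_forest, svm) are inferred from the commands above, and the sklearn _target_ paths and default field values are assumptions.

```python
# Hypothetical sketch of Hydra structured configs behind "model=svm", etc.
from dataclasses import dataclass, field
from typing import Any, List

import hydra
from hydra.core.config_store import ConfigStore
from hydra.utils import instantiate
from omegaconf import MISSING, DictConfig, OmegaConf


@dataclass
class RandomForestConfig:
    _target_: str = "sklearn.ensemble.RandomForestClassifier"
    n_estimators: int = 100


@dataclass
class SVMConfig:
    _target_: str = "sklearn.svm.SVC"
    C: float = 1.0


@dataclass
class Config:
    # Random forest is the default model; "model=svm" on the CLI swaps it out.
    defaults: List[Any] = field(default_factory=lambda: ["_self_", {"model": "random_forest"}])
    model: Any = MISSING
    experiment_name: str = "basic-example"  # overridable via experiment_name=...


cs = ConfigStore.instance()
cs.store(name="config", node=Config)
cs.store(group="model", name="random_forest", node=RandomForestConfig)
cs.store(group="model", name="svm", node=SVMConfig)


@hydra.main(config_path=None, config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))
    model = instantiate(cfg.model)  # builds the sklearn estimator from _target_
    print(model)


if __name__ == "__main__":
    main()
```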