feat(images): Add MLflow #2635

Merged
merged 1 commit on May 9, 2024
9 changes: 9 additions & 0 deletions generated.tf


82 changes: 82 additions & 0 deletions images/mlflow/README.md
@@ -0,0 +1,82 @@
<!--monopod:start-->
# mlflow
| | |
| - | - |
| **OCI Reference** | `cgr.dev/chainguard/mlflow` |


* [View Image in Chainguard Academy](https://edu.chainguard.dev/chainguard/chainguard-images/reference/mlflow/overview/)
* [View Image Catalog](https://console.enforce.dev/images/catalog) for a full list of available tags.
* [Contact Chainguard](https://www.chainguard.dev/chainguard-images) for enterprise support, SLAs, and access to older tags.

---
<!--monopod:end-->

<!--overview:start-->
A minimal, [Wolfi](https://github.com/wolfi-dev)-based image for MLflow, an open source platform for the machine learning lifecycle.

<!--overview:end-->

<!--getting:start-->
## Download this Image
The image is available on `cgr.dev`:

```
docker pull cgr.dev/chainguard/mlflow:latest
```
<!--getting:end-->

<!--body:start-->
### MLflow Usage

MLflow's default entrypoint is Python, enabling us to run experiments directly:

```bash
docker run -it cgr.dev/chainguard/mlflow:latest <your experiment>.py
```
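For illustration, a minimal experiment script might look like the sketch below (the filename and logged values are hypothetical; mount the script into the container, e.g. with `-v "$PWD":"$PWD" -w "$PWD"`, so Python can reach it):

```python
# experiment.py - a minimal, hypothetical MLflow experiment
import mlflow

with mlflow.start_run():
    # Log one parameter and one metric; with no tracking server
    # configured, results are written to a local ./mlruns directory
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.9)
```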

Alternatively, we can override the entrypoint to use the MLflow CLI directly:

```bash
docker run -it --entrypoint mlflow cgr.dev/chainguard/mlflow:latest <options>
```
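For instance, to print the CLI version (any other `mlflow` subcommand or option can be substituted):

```bash
docker run -it --rm --entrypoint mlflow cgr.dev/chainguard/mlflow:latest --version
```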

### MLflow Tracking Usage

MLflow Tracking provides a UI that lets you track 'runs' (executions of data science code) through visualizations of their metrics, parameters, and artifacts.

To start the UI, open a terminal and run:

```bash
docker run -it -p 5000:5000 --entrypoint mlflow cgr.dev/chainguard/mlflow:latest ui --host 0.0.0.0
```

While the UI defaults to port 5000, you can serve it on a different port by passing `-p <PORT>` to the `ui` command. Make sure Docker publishes the same port.
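For example, to serve the UI on port 7777 instead (a sketch mirroring the command above; the port number is arbitrary):

```bash
docker run -it -p 7777:7777 --entrypoint mlflow cgr.dev/chainguard/mlflow:latest ui --host 0.0.0.0 -p 7777
```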

You should now be able to access the UI at [localhost:5000](http://localhost:5000).

The Tracking API can now be leveraged to record metrics, parameters, and artifacts:

```python
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LinearRegression

# Set the MLflow tracking URI
mlflow.set_tracking_uri("http://localhost:5000")

# Start an experiment
mlflow.set_experiment("my_experiment")

# A trivially trained model, standing in for your own training code
model = LinearRegression().fit([[0], [1]], [0, 1])

with mlflow.start_run():
    # Log parameters, metrics, and artifacts
    mlflow.log_param("param1", 0.01)
    mlflow.log_metric("metric1", 0.95)
    mlflow.log_artifact("path/to/artifact")  # replace with a path that exists locally
    # Log the trained model
    mlflow.sklearn.log_model(model, "model")
```

Ensure that the tracking URI correctly reflects where the MLflow server is running.
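If you would rather not hard-code the URI, MLflow also reads it from the `MLFLOW_TRACKING_URI` environment variable:

```bash
export MLFLOW_TRACKING_URI=http://localhost:5000
```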

For additional documentation covering MLflow Tracking, see the [official docs](https://mlflow.org/docs/latest/tracking.html).

<!--body:end-->
55 changes: 55 additions & 0 deletions images/mlflow/TESTING.md
@@ -0,0 +1,55 @@
# Testing MLflow

Start off by pulling down the image:

```bash
docker pull cgr.dev/chainguard/mlflow:latest
```

Now we'll run a quick test to ensure MLflow is detected by Python:

```bash
docker run -it --rm cgr.dev/chainguard/mlflow:latest -m mlflow
```

Because MLflow is installed inside a virtual environment, this also verifies that the image is using the virtual environment's Python rather than the system installation.
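As an optional extra check — a sketch that assumes the entrypoint is the virtual environment's `python3`, as configured in `config/main.tf` — you can print the interpreter path and MLflow version directly:

```bash
docker run -it --rm cgr.dev/chainguard/mlflow:latest -c "import sys, mlflow; print(sys.executable, mlflow.__version__)"
```

The reported interpreter path should point into `/usr/share/mlflow/bin/`.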

Now we can start MLflow Tracker:

```bash
docker run -it --rm -w $(pwd) -v $(pwd):$(pwd) -p 5000:5000 --entrypoint mlflow --name mlflow cgr.dev/chainguard/mlflow:latest ui --host 0.0.0.0
```

By default, this will start on port 5000. We can override this by running the following:

```bash
docker run -it --rm -w $(pwd) -v $(pwd):$(pwd) -p <PORT>:<PORT> --entrypoint mlflow --name mlflow cgr.dev/chainguard/mlflow:latest ui --host 0.0.0.0 -p <PORT>
```

The logs are not very verbose; the key line to look for is `Listening on: 0.0.0.0:<PORT>`.
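Since the container is named `mlflow`, you can also confirm this from another terminal (a minimal sketch):

```bash
docker logs mlflow 2>&1 | grep -i "listening"
```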

Now let's do a quick health check:

```bash
curl -vsL localhost:5000/health
```

The status code should be 200. If all is well, you should be able to access the UI at [localhost:5000](http://localhost:5000).

Now we can test basic functionality of MLflow Tracker. Save the following snippet as `test.py` in your current directory (which is mounted into the container):

```python
import mlflow

with mlflow.start_run():
    for epoch in range(0, 3):
        mlflow.log_metric(key="quality", value=2 * epoch, step=epoch)
```

And then execute it:

```bash
docker exec mlflow python ./test.py
```

This will create a run with a random name that should now be viewable in MLflow's UI.
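Optionally — a sketch that assumes the run landed in the default experiment, which has ID `0` — the Tracking REST API can confirm the run without opening the UI:

```bash
curl -s -X POST localhost:5000/api/2.0/mlflow/runs/search \
  -H "Content-Type: application/json" \
  -d '{"experiment_ids": ["0"]}'
```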
39 changes: 39 additions & 0 deletions images/mlflow/config/main.tf
@@ -0,0 +1,39 @@
terraform {
  required_providers {
    apko = { source = "chainguard-dev/apko" }
  }
}

variable "extra_packages" {
  description = "Additional packages to install."
  type        = list(string)
  default     = ["mlflow"]
}

variable "environment" {
  default = {}
}

module "accts" {
  source = "../../../tflib/accts"
  run-as = 65532
  uid    = 65532
  gid    = 65532
  name   = "nonroot"
}

output "config" {
  value = jsonencode({
    contents = {
      packages = var.extra_packages
    }
    accounts = module.accts.block
    environment = merge({
      "PATH" : "/usr/share/mlflow/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
    }, var.environment)
    entrypoint = {
      command = "/usr/share/mlflow/bin/python3"
    }
    work-dir = "/home/nonroot"
  })
}
13 changes: 13 additions & 0 deletions images/mlflow/generated.tf


38 changes: 38 additions & 0 deletions images/mlflow/main.tf
@@ -0,0 +1,38 @@
terraform {
  required_providers {
    oci = { source = "chainguard-dev/oci" }
  }
}

variable "target_repository" {
  description = "The docker repo into which the image and attestations should be published."
}

module "config" {
  source = "./config"
}

module "latest" {
  source            = "../../tflib/publisher"
  name              = basename(path.module)
  target_repository = var.target_repository
  config            = module.config.config
  build-dev         = true
}

module "test" {
  source = "./tests"
  digest = module.latest.image_ref
}

resource "oci_tag" "latest" {
  depends_on = [module.test]
  digest_ref = module.latest.image_ref
  tag        = "latest"
}

resource "oci_tag" "latest-dev" {
  depends_on = [module.test]
  digest_ref = module.latest.dev_ref
  tag        = "latest-dev"
}
13 changes: 13 additions & 0 deletions images/mlflow/metadata.yaml
@@ -0,0 +1,13 @@
name: mlflow
image: cgr.dev/chainguard/mlflow
logo: https://storage.googleapis.com/chainguard-academy/logos/mlflow.svg
endoflife: ""
console_summary: ""
short_description: |
  A minimal, [Wolfi](https://github.com/wolfi-dev)-based image for MLflow, an open source platform for the machine learning lifecycle.
compatibility_notes: ""
readme_file: README.md
upstream_url: https://mlflow.org/
keywords:
  - ai
  - python
43 changes: 43 additions & 0 deletions images/mlflow/tests/check-mlflow.sh
@@ -0,0 +1,43 @@
#!/usr/bin/env bash

set -o errexit -o nounset -o errtrace -o pipefail -x

# Random port is needed in multi-image test environments
PORT=$(shuf -i 1024-65535 -n 1)
CONTAINER_NAME="mlflow-${PORT}"

# Start MLflow Tracker
docker run \
  -d --rm \
  -v ./tmp/tests:/tmp/tests \
  -p "${PORT}":"${PORT}" \
  --name "${CONTAINER_NAME}" \
  --entrypoint mlflow \
  "${IMAGE_NAME}" \
  ui --host 0.0.0.0 -p "${PORT}"

# Dump logs and stop the container when the script exits
trap 'docker logs "${CONTAINER_NAME}" && docker stop "${CONTAINER_NAME}"' EXIT

# Check MLflow Tracker availability
check_ui_status() {
  local request_retries=10
  local retry_delay=5

  # Install curl
  apk add curl

  # Check availability
  for ((i = 1; i <= request_retries; i++)); do
    if [ "$(docker run --network container:"${CONTAINER_NAME}" cgr.dev/chainguard/curl:latest -o /dev/null -s -w "%{http_code}" "http://localhost:${PORT}/health")" -eq 200 ]; then
      return 0
    fi
    sleep "${retry_delay}"
  done

  echo "FAILED: Did not receive 200 HTTP response from Tracker after ${request_retries} attempts."
  exit 1
}

# Run tests
check_ui_status
56 changes: 56 additions & 0 deletions images/mlflow/tests/linear_regression.py
@@ -0,0 +1,56 @@
from pprint import pprint

import numpy as np
from sklearn.linear_model import LinearRegression

import mlflow
from mlflow.tracking import MlflowClient


def yield_artifacts(run_id, path=None):
    """Yield all artifacts in the specified run"""
    client = MlflowClient()
    for item in client.list_artifacts(run_id, path):
        if item.is_dir:
            yield from yield_artifacts(run_id, item.path)
        else:
            yield item.path


def fetch_logged_data(run_id):
    """Fetch params, metrics, tags, and artifacts in the specified run"""
    client = MlflowClient()
    data = client.get_run(run_id).data
    # Exclude system tags: https://www.mlflow.org/docs/latest/tracking.html#system-tags
    tags = {k: v for k, v in data.tags.items() if not k.startswith("mlflow.")}
    artifacts = list(yield_artifacts(run_id))
    return {
        "params": data.params,
        "metrics": data.metrics,
        "tags": tags,
        "artifacts": artifacts,
    }


def main():
    # enable autologging
    mlflow.sklearn.autolog()

    # prepare training data
    X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(X, np.array([1, 2])) + 3

    # train a model
    model = LinearRegression()
    model.fit(X, y)
    run_id = mlflow.last_active_run().info.run_id
    print(f"Logged data and model in run {run_id}")

    # show logged data
    for key, data in fetch_logged_data(run_id).items():
        print(f"\n---------- logged {key} ----------")
        pprint(data)


if __name__ == "__main__":
    main()