Update MLflow Helm Chart (following bitnami): (#37)
* Update MLflow Helm Chart (following bitnami):
- MLflow version 2.16
- `postgresql.enabled=false` by default (recommended to use CSC PUKKI)
- `minio.enabled=false` by default (recommended to use CSC ALLAS)
- Edit README
- EDIT NOTES: `oc` instead of `kubectl`
- `compatibility.openshift.adaptSecurityContext=auto`. It won't apply the different `SecurityContext`

* Update charts/mlflow/README.md

Co-authored-by: Alvaro Gonzalez <[email protected]>

* Update README and NOTES.txt following suggestions

---------

Co-authored-by: Alvaro Gonzalez <[email protected]>
trispera and lvarin authored Sep 27, 2024
1 parent d0b3840 commit ab196a9
Showing 29 changed files with 738 additions and 382 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,2 +1,3 @@
.DS_Store
Chart.lock
Chart.lock
*.tgz
20 changes: 10 additions & 10 deletions charts/mlflow/Chart.yaml
@@ -1,34 +1,33 @@
# Copyright VMware, Inc.
# Copyright Broadcom, Inc. All Rights Reserved.
# SPDX-License-Identifier: APACHE-2.0

annotations:
category: MachineLearning
licenses: Apache-2.0
images: |
- name: git
image: docker.io/bitnami/git:2.43.0-debian-11-r1
image: docker.io/bitnami/git:2.46.1-debian-12-r1
- name: mlflow
image: docker.io/bitnami/mlflow:2.9.2-debian-11-r0
image: docker.io/bitnami/mlflow:2.16.2-debian-12-r3
- name: os-shell
image: docker.io/bitnami/os-shell:11-debian-11-r92
image: docker.io/bitnami/os-shell:12-debian-12-r30
apiVersion: v2
appVersion: 2.9.2
appVersion: 2.16.2
dependencies:
- condition: minio.enabled
name: minio
repository: oci://registry-1.docker.io/bitnamicharts
version: 12.x.x
version: 14.x.x
- condition: postgresql.enabled
name: postgresql
repository: oci://registry-1.docker.io/bitnamicharts
version: 13.2.28
version: 15.x.x
- name: common
repository: oci://registry-1.docker.io/bitnamicharts
tags:
- bitnami-common
version: 2.x.x
description: MLflow is an open-source platform designed to manage the end-to-end machine learning lifecycle. It allows you to track experiments, package code into reproducible runs, and share and deploy models.
Link to the repo https://github.com/CSCfi/helm-charts
home: https://bitnami.com
icon: https://bitnami.com/assets/stacks/mlflow/img/mlflow-stack-220x234.png
keywords:
@@ -38,10 +37,11 @@ keywords:
- machine
- learning
maintainers:
- name: VMware, Inc.
- name: Broadcom, Inc. All Rights Reserved.
url: https://github.com/bitnami/charts
name: mlflow
sources:
- https://github.com/bitnami/charts/tree/main/bitnami/mlflow
- https://github.com/bitnami/containers/tree/main/bitnami/mlflow
- https://github.com/mlflow/mlflow
version: 0.4.0
version: 1.5.7
56 changes: 49 additions & 7 deletions charts/mlflow/README.md
@@ -5,10 +5,15 @@
## Introduction
This Helm chart deploys MLflow on Rahti2.

It is highly recommended to use the Helm CLI instead of the Rahti2 web UI. To do so, clone the GitHub repository from [here](https://github.com/CSCfi/helm-charts).
The Helm CLI allows you:
- to download the dependencies needed to run the chart, if you decide to run PostgreSQL and MinIO in Rahti2.
- to set the necessary values (see command below), if you decide to run a PostgreSQL instance externally and to use an external S3 service.

## Test and Deploy
Different steps are necessary to deploy this Helm Chart to Rahti2:

1. If you want to use CSC external S3 service (Allas), be sure to create Allas credentials.
1. By default, this Helm Chart will use the CSC S3 service Allas. Be sure to create Allas credentials.
You can achieve this by [sourcing](https://docs.csc.fi/cloud/pouta/install-client/#configure-your-terminal-environment-for-openstack) your cPouta project and then type this command:

```sh
@@ -19,36 +24,73 @@ Different steps are necessary to deploy this Helm Chart to Rahti2:

You can also use another external S3 service instead of Allas.

2. Deploy MLflow:
2. By default, it also uses our CSC database service named [Pukki](https://pukki.dbaas.csc.fi). Be sure to have a database created on this service.
While creating the database, you will be asked for the `Allowed CIDRs`. Rahti2 has a common egress IP, `86.50.229.150`. If you want a dedicated egress IP, you can send a ticket to [[email protected]](mailto:[email protected]). More information [here](https://docs.csc.fi/cloud/rahti2/networking/#egress-ips).

A database named `mlflow_auth` must be created when launching your instance. This database is needed for the auth module (only if `tracking.auth.enabled=true` which is the case by default).

3. Deploy MLflow:

```sh
helm install mlflow . --set externalS3.accessKeyID={ACCESS_KEY} --set externalS3.accessKeySecret={SECRET_KEY} --set externalS3.bucket=mlflow
helm install mlflow . --set externalS3.accessKeyID={ACCESS_KEY} \
--set externalS3.accessKeySecret={SECRET_KEY} \
--set externalS3.bucket={BUCKET_NAME} \
--set externalDatabase.host={DB_PUBLIC_IP} \
--set externalDatabase.user={DB_USER} \
--set externalDatabase.password={DB_PASSWORD} \
--set externalDatabase.database={DB_NAME}
```

_Replace {ACCESS_KEY} by the access key previously created_
_Replace {SECRET_KEY} by the secret key previously created_
_Replace {BUCKET_NAME} by the name of the bucket previously created_
_Replace {DB_PUBLIC_IP} by the public IP of your database created on Pukki_
_Replace {DB_USER} by the user created on Pukki_
_Replace {DB_NAME} by the database created on Pukki_

Alternatively, you can edit the `values.yaml`:

```yaml
[...]
externalDatabase:
  host: ''
  user: ''
  database: ''
  password: ''
[...]
externalS3:
  accessKeyID: ''
  accessKeySecret: ''
  bucket: 'mlflow'
  bucket: ''
```
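
As a minimal sketch of the same idea, the overrides can also live in a separate file passed to Helm with `-f`, which keeps credentials out of the chart's own `values.yaml` and out of the shell history. The file name `my-values.yaml` and the placeholder values are assumptions for illustration; the keys are the ones used in the `--set` flags above.

```sh
# Write the overrides to a separate file.
# The name my-values.yaml is an assumption; replace the {PLACEHOLDERS} with your own values.
cat > my-values.yaml <<'EOF'
externalDatabase:
  host: "{DB_PUBLIC_IP}"
  user: "{DB_USER}"
  password: "{DB_PASSWORD}"
  database: "{DB_NAME}"
externalS3:
  accessKeyID: "{ACCESS_KEY}"
  accessKeySecret: "{SECRET_KEY}"
  bucket: "{BUCKET_NAME}"
EOF

# The file holds credentials, so restrict it to the current user.
chmod 600 my-values.yaml

# Then install with the file instead of repeated --set flags:
#   helm install mlflow . -f my-values.yaml
```

The same file can be reused later with `helm upgrade mlflow . -f my-values.yaml`, so the database and S3 settings do not have to be retyped on every upgrade.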

To access MLflow tracking webpage, run this command to retrieve `user` password:
After the deployment, the Web URL will be displayed in the NOTES. To access MLflow tracking webpage, run this command to retrieve `user` password:
```sh
echo Password: $(oc get secret --namespace {YOUR_NAMESPACE} mlflow-tracking -o jsonpath="{.data.admin-password }" | base64 -d)
echo Password: $(oc get secret --namespace {YOUR_NAMESPACE} mlflow-tracking -o jsonpath="{ .data.admin-password }" | base64 -d)
```
_Replace {YOUR_NAMESPACE} by the name of your project in Rahti_
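
The `base64 -d` step is needed because values under `.data` in a secret are stored base64-encoded. The sketch below only illustrates that round trip locally; the password is a made-up example and no cluster is contacted.

```sh
# Stand-in for the value that `oc get secret ... -o jsonpath=...` would print.
# 'example-password' is a made-up value for illustration, not a real default.
encoded=$(printf '%s' 'example-password' | base64)

# Same decoding step as in the command above.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "Password: $decoded"
```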

You can edit the `config.yaml`. Instead of deleting your deployment and recreating a new one, Helm lets you `upgrade` your release. Use this command:
You can edit the `values.yaml`. Instead of deleting your deployment and recreating a new one, Helm lets you `upgrade` your release. Use this command:
```sh
helm upgrade mlflow . --set externalS3.accessKeyID={ACCESS_KEY} --set externalS3.accessKeySecret={SECRET_KEY} --set externalS3.bucket={BUCKET_NAME}
```

## NOTES
You can also use this chart with PostgreSQL and MinIO deployed in Rahti2. Enable them by editing the `values.yaml`:
```yaml
[...]
postgresql:
  enabled: true
[...]
minio:
  enabled: true
```

**It is highly recommended to use our other services (Pukki and Allas) in a production environment.**

If, for some reason, the Rahti2 node crashes while you have PostgreSQL and MinIO running, it can cause disruptions and corruption in your database.
Pukki also has automatic backups for your databases.

## Project status

## Links
26 changes: 13 additions & 13 deletions charts/mlflow/templates/NOTES.txt
@@ -12,11 +12,11 @@ The chart has been deployed in diagnostic mode. All probes have been disabled an

Get the list of pods by executing:

kubectl get pods --namespace {{ include "common.names.namespace" . | quote }} -l app.kubernetes.io/instance={{ .Release.Name }}
oc get pods --namespace {{ include "common.names.namespace" . | quote }} -l app.kubernetes.io/instance={{ .Release.Name }}

Access the pod you want to debug by executing

kubectl exec --namespace {{ include "common.names.namespace" . | quote }} -ti <NAME OF THE POD> -- bash
oc exec --namespace {{ include "common.names.namespace" . | quote }} -ti <NAME OF THE POD> -- bash

{{- else }}

@@ -27,19 +27,19 @@ The following command will be executed:
{{- include "common.tplvalues.render" (dict "value" .Values.run.source.launchCommand "context" $) | nindent 2 }}

You can see the logs of each running node with:
kubectl logs [POD_NAME]
oc logs [POD_NAME]

and the list of pods:
kubectl get pods --namespace {{ include "common.names.namespace" . }} -l "app.kubernetes.io/name={{ include "common.names.name" . }},app.kubernetes.io/instance={{ .Release.Name }}"
oc get pods --namespace {{ include "common.names.namespace" . }} -l "app.kubernetes.io/name={{ include "common.names.name" . }},app.kubernetes.io/instance={{ .Release.Name }}"
{{- else }}
You didn't specify any entrypoint to your code.
To run it, you can either deploy again using the `source.launchCommand` option to specify your entrypoint, or execute it manually by jumping into the pods:

1. Get the running pods
kubectl get pods --namespace {{ include "common.names.namespace" . }} -l "app.kubernetes.io/name={{ include "common.names.name" . }},app.kubernetes.io/instance={{ .Release.Name }}"
oc get pods --namespace {{ include "common.names.namespace" . }} -l "app.kubernetes.io/name={{ include "common.names.name" . }},app.kubernetes.io/instance={{ .Release.Name }}"

2. Get into a pod
kubectl exec -ti [POD_NAME] bash
oc exec -ti [POD_NAME] bash

3. Execute your script as you would normally do.
{{- end }}
@@ -68,21 +68,21 @@ To access your MLflow site from outside the cluster follow the steps below:

{{- if contains "NodePort" .Values.tracking.service.type }}

export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "mlflow.v0.tracking.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
export NODE_PORT=$(oc get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "mlflow.v0.tracking.fullname" . }})
export NODE_IP=$(oc get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo "MLflow URL: {{ include "mlflow.v0.tracking.protocol" . }}://$NODE_IP:$NODE_PORT/"

{{- else if contains "LoadBalancer" .Values.tracking.service.type }}

NOTE: It may take a few minutes for the LoadBalancer IP to be available.
Watch the status with: 'kubectl get svc --namespace {{ .Release.Namespace }} -w {{ include "mlflow.v0.tracking.fullname" . }}'
Watch the status with: 'oc get svc --namespace {{ .Release.Namespace }} -w {{ include "mlflow.v0.tracking.fullname" . }}'

export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "mlflow.v0.tracking.fullname" . }} --template "{{ "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}" }}")
export SERVICE_IP=$(oc get svc --namespace {{ .Release.Namespace }} {{ include "mlflow.v0.tracking.fullname" . }} --template "{{ "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}" }}")
echo "MLflow URL: {{ include "mlflow.v0.tracking.protocol" . }}://$SERVICE_IP{{- if ne $port "80" }}:{{ include "mlflow.v0.tracking.port" . }}{{ end }}/"

{{- else if contains "ClusterIP" .Values.tracking.service.type }}

kubectl port-forward --namespace {{ .Release.Namespace }} svc/{{ include "mlflow.v0.tracking.fullname" . }} {{ include "mlflow.v0.tracking.port" . }}:{{ include "mlflow.v0.tracking.port" . }} &
oc port-forward --namespace {{ .Release.Namespace }} svc/{{ include "mlflow.v0.tracking.fullname" . }} {{ include "mlflow.v0.tracking.port" . }}:{{ include "mlflow.v0.tracking.port" . }} &
echo "MLflow URL: {{ include "mlflow.v0.tracking.protocol" . }}://127.0.0.1{{- if ne $port "80" }}:{{ include "mlflow.v0.tracking.port" . }}{{ end }}//"


@@ -100,8 +100,8 @@ To access your MLflow site from outside the cluster follow the steps below:
{{- if .Values.tracking.enabled }}
3. Login with the following credentials below to see your blog:

echo Username: $(kubectl get secret --namespace {{ .Release.Namespace }} {{ include "mlflow.v0.tracking.fullname" . }} -o jsonpath="{ .data.{{ include "mlflow.v0.tracking.userKey" . }} }" | base64 -d)
echo Password: $(kubectl get secret --namespace {{ .Release.Namespace }} {{ include "mlflow.v0.tracking.fullname" . }} -o jsonpath="{.data.{{ include "mlflow.v0.tracking.passwordKey" . }} }" | base64 -d)
echo Username: $(oc get secret --namespace {{ .Release.Namespace }} {{ include "mlflow.v0.tracking.fullname" . }} -o jsonpath="{ .data.{{ include "mlflow.v0.tracking.userKey" . }} }" | base64 -d)
echo Password: $(oc get secret --namespace {{ .Release.Namespace }} {{ include "mlflow.v0.tracking.fullname" . }} -o jsonpath="{ .data.{{ include "mlflow.v0.tracking.passwordKey" . }} }" | base64 -d)
{{- end }}
{{- end }}
{{- end }}