vertex Model Registry and Deployer #159

Open
wants to merge 1 commit into
base: main
8 changes: 8 additions & 0 deletions vertex-registry-and-deployer/.copier-answers.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Changes here will be overwritten by Copier
_commit: 2024.09.24
_src_path: gh:zenml-io/template-starter
email: [email protected]
full_name: ZenML GmbH
open_source_license: apache
project_name: ZenML Starter
version: 0.1.0
2 changes: 2 additions & 0 deletions vertex-registry-and-deployer/.dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
.venv*
.requirements*
15 changes: 15 additions & 0 deletions vertex-registry-and-deployer/LICENSE
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
Apache Software License 2.0

Copyright (c) ZenML GmbH 2024. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
87 changes: 87 additions & 0 deletions vertex-registry-and-deployer/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
# 🚀 Deploying ML Models with ZenML on Vertex AI


Welcome to your ZenML project for deploying ML models using Google Cloud's Vertex AI! This project provides a hands-on experience with MLOps pipelines using ZenML and Vertex AI. It contains a collection of ZenML steps, pipelines, and other artifacts to help you efficiently deploy your machine learning models.

Using these pipelines, you can run data preparation, model training, registration, and deployment with a single command while using YAML files for [configuration](https://docs.zenml.io/user-guide/production-guide/configure-pipeline). ZenML takes care of tracking your metadata and [containerizing your pipelines](https://docs.zenml.io/how-to/customize-docker-builds).


## 🏃 How to run

In this project, we will train and deploy a classification model to [Vertex AI](https://cloud.google.com/vertex-ai). Before running any pipelines, set up your environment as follows:

```bash
# Set up a Python virtual environment, if you haven't already
python3 -m venv .venv
source .venv/bin/activate

# Install requirements
pip install -r requirements.txt
```

We will need to set up access to Google Cloud and Vertex AI. You can follow the instructions in the [ZenML documentation](https://docs.zenml.io/how-to/auth-management/gcp-service-connector)
to register a service connector and set up your Google Cloud credentials.

Once you have set up your Google Cloud credentials, we can create a stack and run the deployment pipeline:

```bash
# Register the artifact store
zenml artifact-store register gs_store -f gcp --path=gs://bucket-name
zenml artifact-store connect gs_store --connector gcp

# Register the model registry
zenml model-registry register vertex_registry --flavor=vertex --location=us-central1
zenml model-registry connect vertex_registry --connector gcp

# Register Model Deployer
zenml model-deployer register vertex_deployer --flavor=vertex --location=us-central1
zenml model-deployer connect vertex_deployer --connector vertex_deployer_connector

# Register the stack
zenml stack register vertex_stack --orchestrator default --artifact-store gs_store --model-registry vertex_registry --model-deployer vertex_deployer
```

Now that we have set up our stack, we can run the training pipeline, which trains the model, registers it in the Vertex AI model registry, and deploys it to a Vertex AI endpoint.

```bash
python run.py --training-pipeline
```

Once the pipeline has completed, you can check the status of the model in the Vertex AI model registry and the deployed model in the Vertex AI endpoint.

```bash
# List models in the model registry
zenml model-registry models list

# List deployed models
zenml model-deployer models list
```

You can also run the inference pipeline separately:

```bash
python run.py --inference-pipeline
```
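
`run.py` itself is not part of the diff shown here. As a rough sketch of how the two flags used above could be dispatched, assuming a plain `argparse` CLI (the real script may use a different CLI library and would call the ZenML pipelines directly; `build_parser` and `select_pipelines` are hypothetical helpers):

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the two flags mentioned in this README."""
    parser = argparse.ArgumentParser(description="Run ZenML pipelines (sketch).")
    parser.add_argument(
        "--training-pipeline",
        action="store_true",
        help="Run the training pipeline.",
    )
    parser.add_argument(
        "--inference-pipeline",
        action="store_true",
        help="Run the inference pipeline.",
    )
    return parser


def select_pipelines(argv: Optional[List[str]] = None) -> List[str]:
    """Return the names of the pipelines requested on the command line."""
    args = build_parser().parse_args(argv)
    selected = []
    if args.training_pipeline:
        selected.append("training")
    if args.inference_pipeline:
        selected.append("inference")
    return selected
```

Calling `select_pipelines(["--training-pipeline"])` returns the list of pipeline names to run; an actual entry point would map each name to the corresponding pipeline and its YAML file in `configs/`.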


## 📜 Project Structure

The project loosely follows [the recommended ZenML project structure](https://docs.zenml.io/how-to/setting-up-a-project-repository/best-practices):

```
.
├── configs # Pipeline configuration files
│ ├── training_sgd.yaml # Configuration for training pipeline
│ ├── inference.yaml # Configuration for inference pipeline
├── pipelines # `zenml.pipeline` implementations
│ ├── training.py # Training pipeline
│ ├── inference.py # Inference pipeline
├── steps # `zenml.step` implementations
│ ├── model_trainer.py # Model training step
│ ├── model_register.py # Model registration step
│ ├── model_promoter.py # Model promotion step
│ ├── model_deployer.py # Model deployment step to Vertex AI
├── README.md # This file
├── requirements.txt # Extra Python dependencies
└── run.py # CLI tool to run pipelines on ZenML Stack
```
16 changes: 16 additions & 0 deletions vertex-registry-and-deployer/configs/inference.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# environment configuration
settings:
  docker:
    required_integrations:
      - sklearn
      - pandas
    requirements:
      - pyarrow

# configuration of the Model Control Plane
model:
  name: "breast_cancer_classifier"
  version: "production"
  license: Apache 2.0
  description: A breast cancer classifier
  tags: ["breast_cancer", "classifier"]
16 changes: 16 additions & 0 deletions vertex-registry-and-deployer/configs/training_sgd.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# environment configuration
settings:
  docker:
    required_integrations:
      - sklearn
      - pandas
    requirements:
      - pyarrow

# configuration of the Model Control Plane
model:
  name: breast_cancer_classifier
  version: sgd
  license: Apache 2.0
  description: A breast cancer classifier
  tags: ["breast_cancer", "classifier"]
19 changes: 19 additions & 0 deletions vertex-registry-and-deployer/pipelines/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# Apache Software License 2.0
#
# Copyright (c) ZenML GmbH 2024. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

from .inference import inference
from .training import training
45 changes: 45 additions & 0 deletions vertex-registry-and-deployer/pipelines/inference.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
# Apache Software License 2.0
#
# Copyright (c) ZenML GmbH 2024. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from zenml import get_pipeline_context, pipeline
from zenml.logger import get_logger

logger = get_logger(__name__)


@pipeline
def inference(random_state: int, target: str):
    """
    Model inference pipeline.

    This is a pipeline that loads the inference data, processes it with
    the same preprocessing pipeline used in training, and runs inference
    with the trained model.

    Args:
        random_state: Random state for reproducibility.
        target: Name of target column in dataset.
    """
    # Get the production model artifact
    model = get_pipeline_context().model.get_artifact("sklearn_classifier")

    # Get the preprocess pipeline artifact associated with this version
    preprocess_pipeline = get_pipeline_context().model.get_artifact(
        "preprocess_pipeline"
    )

    # Link all the steps together by calling them and passing the output
    # of one step as the input of the next step.
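
The pipeline body above stops after fetching the two artifacts. The flow the docstring describes (apply the training-time preprocessing, then predict) can be sketched in plain Python. This is an illustration only: `run_inference` and the toy `scale` and `threshold_model` stand-ins are hypothetical and not part of this PR; in the real pipeline these would be ZenML steps operating on the loaded artifacts.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def run_inference(
    preprocess: Callable[[Row], Row],
    predict: Callable[[Row], int],
    rows: List[Row],
) -> List[int]:
    """Apply the training-time preprocessing to each row, then predict."""
    return [predict(preprocess(row)) for row in rows]


# Toy stand-in for the `preprocess_pipeline` artifact (hypothetical).
def scale(row: Row) -> Row:
    return [value / 10.0 for value in row]


# Toy stand-in for the `sklearn_classifier` artifact (hypothetical).
def threshold_model(row: Row) -> int:
    return 1 if sum(row) > 1.0 else 0


predictions = run_inference(scale, threshold_model, [[2.0, 3.0], [8.0, 9.0]])
```

The point of the sketch is the ordering: the same preprocessing object used in training is applied before every prediction, which is why the pipeline loads both artifacts from the same model version.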
55 changes: 55 additions & 0 deletions vertex-registry-and-deployer/pipelines/training.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# Apache Software License 2.0
#
# Copyright (c) ZenML GmbH 2024. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

from typing import Optional
from uuid import UUID

from steps import model_deployer, model_promoter, model_register, model_trainer
from zenml import pipeline
from zenml.client import Client
from zenml.logger import get_logger

logger = get_logger(__name__)


@pipeline
def training(
    target: Optional[str] = "target",
):
    """Model training pipeline.

    This is a pipeline that loads the data from a preprocessing pipeline,
    trains a model on it and evaluates the model. If it is the first model
    to be trained, it will be promoted to production. If not, it will be
    promoted only if it has a higher accuracy than the current production
    model version.

    Args:
        target: Name of target column in dataset.
    """
    # Link all the steps together by calling them and passing the output
    # of one step as the input of the next step.

    model, accuracy = model_trainer(target=target)
    is_promoted = model_promoter(accuracy=accuracy)
    # NOTE: `is_promoted` is a step output placeholder while the pipeline is
    # being defined, so this check likely does not depend on the run-time
    # promotion result.
    if is_promoted:
        model_registry_uri = model_register()
        model_deployer(model_registry_uri=model_registry_uri)
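
The promotion rule described in the docstring can be written down as a small pure function. This is a sketch of the rule only, not the `model_promoter` step from this PR (whose implementation is not shown here); `should_promote` and its arguments are hypothetical names.

```python
from typing import Optional


def should_promote(
    new_accuracy: float, production_accuracy: Optional[float]
) -> bool:
    """Decide whether a freshly trained model should replace production.

    - No production model yet (None): always promote.
    - Otherwise: promote only on strictly higher accuracy.
    """
    if production_accuracy is None:
        return True
    return new_accuracy > production_accuracy
```

Using a strict comparison means a retrained model with identical accuracy leaves the current production version in place; whether ties should promote is a design choice the docstring does not settle.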

5 changes: 5 additions & 0 deletions vertex-registry-and-deployer/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
zenml[server]>=0.70.1
notebook
scikit-learn
pyarrow
pandas