2.0.0 (#50)
Major upgrade; please see the release notes.
eedorenko authored and dtzar committed Aug 16, 2019
1 parent b013780 commit 41ea9b7
Showing 76 changed files with 920 additions and 2,897 deletions.
36 changes: 36 additions & 0 deletions .env.example
@@ -0,0 +1,36 @@
# Azure Subscription Variables
WORKSPACE_NAME = ''
RESOURCE_GROUP = ''
SUBSCRIPTION_ID = ''
LOCATION = ''
TENANT_ID = ''

# Azure ML Workspace Variables
EXPERIMENT_NAME = ''
SCRIPT_FOLDER = './'
BLOB_STORE_NAME = ''
# Remote VM Config
REMOTE_VM_NAME = ''
REMOTE_VM_USERNAME = ''
REMOTE_VM_PASSWORD = ''
REMOTE_VM_IP = ''
# AML Compute Cluster Config
AML_CLUSTER_NAME = ''
AML_CLUSTER_VM_SIZE = ''
AML_CLUSTER_MAX_NODES = ''
AML_CLUSTER_MIN_NODES = ''
AML_CLUSTER_PRIORITY = 'lowpriority'
# Training Config
MODEL_NAME = ''
# AML Pipeline Config
TRAINING_PIPELINE_NAME = ''
PIPELINE_CONDA_PATH = 'aml_config/conda_dependencies.yml'
MODEL_PATH = ''
# Image config
IMAGE_NAME = ''
IMAGE_DESCRIPTION = ''
IMAGE_VERSION = ''
# ACI Config
ACI_CPU_CORES = ''
ACI_MEM_GB = ''
ACI_DESCRIPTION = ''
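
A script could read these settings along the following lines. This is a minimal sketch, not the repository's own loader: `parse_env_text` is a hypothetical helper, and the parsing rules (spaces around `=`, quoted values, `#` comment lines) are assumptions inferred from the file above. A dedicated library such as python-dotenv would be the more usual choice in practice.

```python
def parse_env_text(text):
    """Parse KEY = 'value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore the line
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


# A few lines copied from the .env.example above.
SAMPLE = """\
# AML Compute Cluster Config
AML_CLUSTER_NAME = ''
AML_CLUSTER_PRIORITY = 'lowpriority'
SCRIPT_FOLDER = './'
"""

settings = parse_env_text(SAMPLE)
print(settings["AML_CLUSTER_PRIORITY"])
```

Empty values come back as empty strings, so a caller would still need to check that required entries such as `WORKSPACE_NAME` have been filled in.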
2 changes: 2 additions & 0 deletions .gitignore
@@ -103,3 +103,5 @@ venv.bak/

# mypy
.mypy_cache/

.DS_Store
26 changes: 26 additions & 0 deletions .pipelines/azdo-base-pipeline.yml
@@ -0,0 +1,26 @@
# this pipeline should be ignored for now
parameters:
  pipelineType: 'training'

steps:
- script: |
    flake8 --output-file=$(Build.BinariesDirectory)/lint-testresults.xml --format junit-xml
  workingDirectory: '$(Build.SourcesDirectory)'
  displayName: 'Run code quality tests'
  enabled: 'true'

- script: |
    pytest --junitxml=$(Build.BinariesDirectory)/unit-testresults.xml $(Build.SourcesDirectory)/tests/unit
  displayName: 'Run unit tests'
  enabled: 'true'
  env:
    SP_APP_SECRET: '$(SP_APP_SECRET)'

- task: PublishTestResults@2
  condition: succeededOrFailed()
  inputs:
    testResultsFiles: '$(Build.BinariesDirectory)/*-testresults.xml'
    testRunTitle: 'Linting & Unit tests'
    failTaskOnFailedTests: true
  displayName: 'Publish linting and unit test results'
  enabled: 'true'
45 changes: 45 additions & 0 deletions .pipelines/azdo-ci-build-train.yml
@@ -0,0 +1,45 @@
pr: none
trigger:
  branches:
    include:
    - master

pool:
  vmImage: 'ubuntu-latest'

container: mcr.microsoft.com/mlops/python:latest


variables:
- group: devopsforai-aml-vg


steps:
- template: azdo-base-pipeline.yml

- bash: |
    # Invoke the Python building and publishing a training pipeline
    python3 $(Build.SourcesDirectory)/ml_service/pipelines/build_train_pipeline.py
  failOnStderr: 'false'
  env:
    SP_APP_SECRET: '$(SP_APP_SECRET)'
  displayName: 'Train model using AML with Remote Compute'
  enabled: 'true'

- task: CopyFiles@2
  displayName: 'Copy Files to: $(Build.ArtifactStagingDirectory)'
  inputs:
    SourceFolder: '$(Build.SourcesDirectory)'
    TargetFolder: '$(Build.ArtifactStagingDirectory)'
    Contents: |
      ml_service/pipelines/?(run_train_pipeline.py|*.json)
      code/scoring/**
- task: PublishBuildArtifacts@1
  displayName: 'Publish Artifact'
  inputs:
    ArtifactName: 'mlops-pipelines'
    publishLocation: 'container'
    pathtoPublish: '$(Build.ArtifactStagingDirectory)'
    TargetPath: '$(Build.ArtifactStagingDirectory)'
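
The two `Contents` patterns of the `CopyFiles@2` step decide which files end up in the `mlops-pipelines` artifact. The sketch below approximates them in Python to show which paths would be staged. It assumes that `?(a|b)` is an extended-glob group selecting one of its alternatives and that `**` matches at any depth; `is_staged` and the example file names other than `run_train_pipeline.py` and `build_train_pipeline.py` are hypothetical, and edge cases such as dotfiles are not modeled.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_staged(path):
    """Approximate the two Contents patterns from the CopyFiles step."""
    p = PurePosixPath(path)
    # ml_service/pipelines/?(run_train_pipeline.py|*.json):
    # a file directly in that folder named run_train_pipeline.py or *.json.
    if p.parent == PurePosixPath("ml_service/pipelines"):
        return p.name == "run_train_pipeline.py" or fnmatchcase(p.name, "*.json")
    # code/scoring/**: anything below code/scoring, at any depth.
    return PurePosixPath("code/scoring") in p.parents


# Hypothetical repository paths, used only to exercise the function.
candidates = [
    "ml_service/pipelines/run_train_pipeline.py",
    "ml_service/pipelines/build_train_pipeline.py",
    "ml_service/pipelines/example_config.json",
    "code/scoring/example_score.py",
    "code/training/example_train.py",
]
for candidate in candidates:
    print(candidate, is_staged(candidate))
```

Under these assumptions the script that builds the pipeline stays out of the artifact, while the script that runs it, any JSON files next to it, and the scoring code are copied.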
18 changes: 18 additions & 0 deletions .pipelines/azdo-pr-build-train.yml
@@ -0,0 +1,18 @@
trigger: none
pr:
  branches:
    include:
    - master

pool:
  vmImage: 'ubuntu-latest'

container: mcr.microsoft.com/mlops/python:latest


variables:
- group: devopsforai-aml-vg


steps:
- template: azdo-base-pipeline.yml
32 changes: 14 additions & 18 deletions README.md
@@ -1,9 +1,18 @@
---
page_type: sample
languages:
- python
products:
- azure
- azure-machine-learning-service
- azure-devops
---

# MLOps with Azure ML


[![Build Status](https://dev.azure.com/customai/DevopsForAI-AML/_apis/build/status/Microsoft.MLOpsPython?branchName=master)](https://dev.azure.com/customai/DevopsForAI-AML/_build/latest?definitionId=25&branchName=master)

### Author: Praneet Solanki | Richin Jain

MLOps will help you to understand how to build the Continuous Integration and Continuous Delivery pipeline for a ML/AI project. We will be using the Azure DevOps Project for build and release/deployment pipelines along with Azure ML services for model retraining pipeline, model management and operationalization.

@@ -25,20 +34,15 @@ To deploy this solution in your subscription, follow the manual instructions in

This reference architecture shows how to implement continuous integration (CI), continuous delivery (CD), and retraining pipeline for an AI application using Azure DevOps and Azure Machine Learning. The solution is built on the scikit-learn diabetes dataset but can be easily adapted for any AI scenario and other popular build systems such as Jenkins and Travis.

![Architecture](/docs/images/Architecture_DevOps_AI.png)
![Architecture](/docs/images/main-flow.png)


## Architecture Flow

### Train Model
1. Data Scientist writes/updates the code and push it to git repo. This triggers the Azure DevOps build pipeline (continuous integration).
2. Once the Azure DevOps build pipeline is triggered, it runs following types of tasks:
- Run for new code: Every time new code is committed to the repo, the build pipeline performs data sanity tests and unit tests on the new code.
- One-time run: These tasks runs only for the first time the build pipeline runs. It will programatically create an [Azure ML Service Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace), provision [Azure ML Compute](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute) (used for model training compute), and publish an [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines). This published Azure ML pipeline is the model training/retraining pipeline.

> Note: The Publish Azure ML pipeline task currently runs for every code change
3. The Azure ML Retraining pipeline is triggered once the Azure DevOps build pipeline completes. All the tasks in this pipeline runs on Azure ML Compute created earlier. Following are the tasks in this pipeline:
2. Once the Azure DevOps build pipeline is triggered, it runs code quality checks, data sanity tests, and unit tests, then builds an [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) and publishes it in an [Azure ML Service Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace).
3. The [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) is triggered once the Azure DevOps build pipeline completes. All the tasks in this pipeline run on Azure ML Compute. The tasks in this pipeline are as follows:

- **Train Model** task executes model training script on Azure ML Compute. It outputs a [model](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#model) file which is stored in the [run history](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#run).

@@ -50,16 +54,8 @@ This reference architecture shows how to implement continuous integration (CI),

Once you have registered your ML model, you can use Azure ML + Azure DevOps to deploy it.

The **Package Model** task packages the new model along with the scoring file and its python dependencies into a [docker image](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#image) and pushes it to [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro). This image is used to deploy the model as [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service).

The **Deploy Model** task handles deploying your Azure ML model to the cloud (ACI or AKS).
This pipeline deploys the model scoring image into Staging/QA and PROD environments.

In the Staging/QA environment, one task creates an [Azure Container Instance](https://docs.microsoft.com/en-us/azure/container-instances/container-instances-overview) and deploys the scoring image as a [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service) on it.

The second task invokes the web service by calling its REST endpoint with dummy data.
[Azure DevOps release pipeline](https://docs.microsoft.com/en-us/azure/devops/pipelines/release/?view=azure-devops) packages the new model along with the scoring file and its python dependencies into a [docker image](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#image) and pushes it to [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro). This image is used to deploy the model as a [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service) across QA and Prod environments. The QA environment is running on top of [Azure Container Instances (ACI)](https://azure.microsoft.com/en-us/services/container-instances/) and the Prod environment is built with [Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/aks/intro-kubernetes).

5. The deployment in production is a [gated release](https://docs.microsoft.com/en-us/azure/devops/pipelines/release/approvals/gates?view=azure-devops). This means that once the model web service deployment in the Staging/QA environment is successful, a notification is sent to approvers to manually review and approve the release. Once the release is approved, the model scoring web service is deployed to [Azure Kubernetes Service(AKS)](https://docs.microsoft.com/en-us/azure/aks/intro-kubernetes) and the deployment is tested.

### Repo Details

50 changes: 0 additions & 50 deletions aml_config/conda_dependencies.yml

This file was deleted.

6 changes: 0 additions & 6 deletions aml_config/config.json

This file was deleted.

15 changes: 0 additions & 15 deletions aml_config/security_config.json

This file was deleted.

64 changes: 0 additions & 64 deletions aml_service/00-WorkSpace.py

This file was deleted.

44 changes: 0 additions & 44 deletions aml_service/01-Experiment.py

This file was deleted.
