Pascal Vicaire edited this page Nov 8, 2018 · 15 revisions

What is Kubeflow Pipelines?

Kubeflow Pipelines is a platform for building and deploying portable, scalable, end-to-end ML workflows based on containers. The Kubeflow Pipelines platform consists of:

  • User interface for managing and tracking experiments, jobs, and runs
  • Engine for scheduling multi-step ML workflows
  • SDK for defining and manipulating pipelines and components
  • Notebooks for interacting with the system using the SDK

Getting started

Goals of the Kubeflow pipelines service

The Kubeflow pipelines service has the following goals:

  • End-to-end orchestration: enabling and simplifying the orchestration of end-to-end machine learning pipelines.
  • Easy experimentation: making it easy to try numerous ideas and techniques and to manage your various trials and experiments.
  • Easy re-use: enabling you to re-use components and pipelines to quickly assemble end-to-end solutions without rebuilding each one every time.
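The re-use goal can be illustrated without the SDK itself: a component is a parameterized step, and a pipeline is a function that wires components together. Below is a minimal pure-Python sketch of that composition style; the `make_step` helper and the step names are hypothetical, for illustration only, and are not Kubeflow Pipelines APIs (the real SDK composes containerized ops).

```python
# A "component" here is a reusable, parameterized step function; a
# "pipeline" composes components. This mimics, in plain Python, the
# composition style the Kubeflow Pipelines SDK provides with container
# ops. All names below are illustrative, not KFP APIs.

def make_step(name, fn):
    """Wrap a function as a named, re-usable step."""
    def step(*args):
        result = fn(*args)
        print(f"[{name}] -> {result}")
        return result
    return step

# Two re-usable "components".
normalize = make_step("normalize", lambda xs: [x / max(xs) for x in xs])
mean = make_step("mean", lambda xs: sum(xs) / len(xs))

def pipeline(raw):
    # Re-use the same components in any pipeline without rebuilding them.
    return mean(normalize(raw))

print(pipeline([2.0, 4.0, 8.0]))
```

The same `normalize` and `mean` steps could be dropped into a different pipeline unchanged, which is the point of the re-use goal.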

The Python code to represent a pipeline workflow graph

import kfp.dsl as dsl

# CreateClusterOp, DeleteClusterOp, AnalyzeOp, TransformOp, TrainerOp,
# PredictOp, ConfusionMatrixOp, and RocOp are container-op wrappers
# defined elsewhere in this sample (not shown here).

@dsl.pipeline(
  name='XGBoost Trainer',
  description='A trainer that does end-to-end distributed training for XGBoost models.'
)
def xgb_train_pipeline(
    output,
    project,
    region='us-central1',
    train_data='gs://ml-pipeline-playground/sfpd/train.csv',
    eval_data='gs://ml-pipeline-playground/sfpd/eval.csv',
    schema='gs://ml-pipeline-playground/sfpd/schema.json',
    target='resolution',
    rounds=200,
    workers=2,
    true_label='ACTION',
):
  delete_cluster_op = DeleteClusterOp('delete-cluster', project, region)
  with dsl.ExitHandler(exit_op=delete_cluster_op):
    create_cluster_op = CreateClusterOp('create-cluster', project, region, output)

    analyze_op = AnalyzeOp('analyze', project, region, create_cluster_op.output,
                           schema, train_data,
                           '%s/{{workflow.name}}/analysis' % output)

    transform_op = TransformOp('transform', project, region,
                               create_cluster_op.output, train_data, eval_data,
                               target, analyze_op.output,
                               '%s/{{workflow.name}}/transform' % output)

    train_op = TrainerOp('train', project, region, create_cluster_op.output,
                         transform_op.outputs['train'], transform_op.outputs['eval'],
                         target, analyze_op.output, workers,
                         rounds, '%s/{{workflow.name}}/model' % output)

    predict_op = PredictOp('predict', project, region, create_cluster_op.output,
                           transform_op.outputs['eval'], train_op.output, target,
                           analyze_op.output,
                           '%s/{{workflow.name}}/predict' % output)

    cm_op = ConfusionMatrixOp('confusion-matrix',
                              predict_op.output,
                              '%s/{{workflow.name}}/confusionmatrix' % output)

    roc_op = RocOp('roc', predict_op.output, true_label,
                   '%s/{{workflow.name}}/roc' % output)
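
The `dsl.ExitHandler` block above guarantees that `delete_cluster_op` runs after the wrapped steps finish, whether they succeed or fail, so the cluster is always torn down. The semantics are analogous to Python's `try`/`finally`; the following is a plain-Python sketch of that guarantee, not KFP code:

```python
# Sketch of the ExitHandler guarantee using try/finally: the exit step
# runs whether the body succeeds or raises, just as the delete-cluster
# op runs regardless of the pipeline's outcome.

log = []

def create_cluster():
    log.append("create-cluster")

def train():
    log.append("train")
    raise RuntimeError("training failed")  # simulate a failing step

def delete_cluster():
    log.append("delete-cluster")  # plays the role of the exit op

try:
    try:
        create_cluster()
        train()
    finally:
        delete_cluster()  # always runs, like dsl.ExitHandler's exit_op
except RuntimeError:
    pass

print(log)  # delete-cluster appears even though train() failed
```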

The above pipeline after you've uploaded it:

[Screenshot: Job]

The runtime execution graph of the pipeline:

[Screenshot: Graph]

Outputs from the pipeline:

[Screenshots: Prediction Output, Confusion Matrix Output, ROC Output]

Developer Guide
