KFP - Updated the documentation regarding metrics (#2171)
* [WIP] Updated the documentation regarding metrics

* Resolved the feedback

* Added an alternative python component example

* Added a component.yaml example

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <[email protected]>

* Update content/en/docs/pipelines/sdk/pipelines-metrics.md

Co-authored-by: 8bitmp3 <[email protected]>

Co-authored-by: 8bitmp3 <[email protected]>
Ark-kun and 8bitmp3 authored Nov 22, 2020
1 parent bd06c17 commit 75b8f42
Showing 1 changed file with 82 additions and 20 deletions.
102 changes: 82 additions & 20 deletions content/en/docs/pipelines/sdk/pipelines-metrics.md
description = "Export and visualize pipeline metrics"
weight = 90

+++

This page shows you how to export metrics from a Kubeflow Pipelines component.
For details about how to build a component, see the guide to
[building your own component](/docs/pipelines/sdk/component-development/).

## Overview of metrics

pipeline agent uploads the local file as your run-time metrics. You can view the
uploaded metrics as a visualization in the **Runs** page for a particular
experiment in the Kubeflow Pipelines UI.
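For instance, the uploaded artifact is a JSON document with a top-level `metrics` list, one entry per metric. The following stdlib-only sketch writes such a file the way a component would and reads it back (the temporary-file path is illustrative):

```python
import json
import tempfile

# A minimal metrics dictionary: one metric with a name, a numeric value,
# and an optional display format.
metrics = {
    'metrics': [{
        'name': 'accuracy-score',
        'numberValue': 0.9,
        'format': 'PERCENTAGE',
    }]
}

# Write the dictionary to a local file, as a component would, then read it back.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump(metrics, f)
    path = f.name

with open(path) as f:
    print(json.load(f)['metrics'][0]['name'])  # prints: accuracy-score
```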

## Export the metrics dictionary

To enable metrics, your component must have an output called `MLPipeline
Metrics` and return a JSON-serialized metrics dictionary. Otherwise, the
Kubeflow Pipelines UI will not render the visualization. In other words, the
`.outputs.artifacts` setting for the generated pipeline template should show:
`- {name: mlpipeline-metrics, path: /tmp/outputs/mlpipeline_metrics/data}`.
(The file path itself does not matter.)

Here's an example of a lightweight Python component that outputs a metrics dictionary by writing it to an output file:

```Python
from kfp.components import InputPath, OutputPath, create_component_from_func

def produce_metrics(
    # Note: when the `create_component_from_func` method converts the function
    # to a component, the function parameter "mlpipeline_metrics_path" becomes
    # an output with the name "mlpipeline_metrics", which is the correct name
    # for the metrics output.
    mlpipeline_metrics_path: OutputPath('Metrics'),
):
    import json

    accuracy = 0.9
    metrics = {
        'metrics': [{
            'name': 'accuracy-score',  # The name of the metric. Visualized as the column name in the runs table.
            'numberValue': accuracy,  # The value of the metric. Must be a numeric value.
            'format': "PERCENTAGE",  # The optional format of the metric. Supported values are "RAW" (displayed in raw format) and "PERCENTAGE" (displayed in percentage format).
        }]
    }
    with open(mlpipeline_metrics_path, 'w') as f:
        json.dump(metrics, f)

produce_metrics_op = create_component_from_func(
    produce_metrics,
    base_image='python:3.7',
    packages_to_install=[],
    output_component_file='component.yaml',
)
```

Here's an example of a lightweight Python component that outputs a metrics dictionary by returning it from the function:

```Python
from typing import NamedTuple
from kfp.components import InputPath, OutputPath, create_component_from_func

def produce_metrics() -> NamedTuple('Outputs', [
    ('mlpipeline_metrics', 'Metrics'),
]):
    import json

    accuracy = 0.9
    metrics = {
        'metrics': [{
            'name': 'accuracy-score',  # The name of the metric. Visualized as the column name in the runs table.
            'numberValue': accuracy,  # The value of the metric. Must be a numeric value.
            'format': "PERCENTAGE",  # The optional format of the metric. Supported values are "RAW" (displayed in raw format) and "PERCENTAGE" (displayed in percentage format).
        }]
    }
    return [json.dumps(metrics)]

produce_metrics_op = create_component_from_func(
    produce_metrics,
    base_image='python:3.7',
    packages_to_install=[],
    output_component_file='component.yaml',
)
```
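Because the function above is plain Python, you can sanity-check it outside of a pipeline by calling it directly and parsing the serialized result. The following stdlib-only sketch redefines the function without the KFP wrapper:

```python
import json
from typing import NamedTuple

def produce_metrics() -> NamedTuple('Outputs', [
    ('mlpipeline_metrics', 'Metrics'),
]):
    # Same body as the component function above, minus the KFP conversion.
    accuracy = 0.9
    metrics = {
        'metrics': [{
            'name': 'accuracy-score',
            'numberValue': accuracy,
            'format': 'PERCENTAGE',
        }]
    }
    return [json.dumps(metrics)]

# Call the function directly and verify the output deserializes correctly.
(serialized,) = produce_metrics()
parsed = json.loads(serialized)
print(parsed['metrics'][0]['numberValue'])  # prints: 0.9
```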

An example of a script-based `component.yaml` component:

```yaml
name: Produce metrics
outputs:
- {name: MLPipeline Metrics, type: Metrics}
implementation:
  container:
    image: alpine
    command:
    - sh
    - -exc
    - |
      output_metrics_path=$0
      mkdir -p "$(dirname "$output_metrics_path")"
      echo '{
        "metrics": [{
          "name": "accuracy-score",
          "numberValue": 0.8,
          "format": "PERCENTAGE"
        }]
      }' > "$output_metrics_path"
    - {outputPath: MLPipeline Metrics}
```
Refer to the [full example](https://github.com/kubeflow/pipelines/blob/master/components/local/confusion_matrix/src/confusion_matrix.py) of a component that generates confusion matrix data from prediction results.

The metrics output has the following requirements:

* The output name must be `MLPipeline Metrics` or `MLPipeline_Metrics` (case does not matter).
* The `name` of each metric must match the following pattern: `^[a-zA-Z]([-_a-zA-Z0-9]{0,62}[a-zA-Z0-9])?$`.
  For Kubeflow Pipelines version 0.5.1 or earlier, the name must match the pattern `^[a-z]([-a-z0-9]{0,62}[a-z0-9])?$`.
* `numberValue` must be a numeric value.
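The `name` pattern above can be checked before a run ever starts. The helper below is a hypothetical convenience (it is not part of the KFP SDK) that applies the regular expression with Python's `re` module:

```python
import re

# The metric-name pattern from the requirements above.
METRIC_NAME_PATTERN = re.compile(r'^[a-zA-Z]([-_a-zA-Z0-9]{0,62}[a-zA-Z0-9])?$')

def is_valid_metric_name(name: str) -> bool:
    """Return True if `name` satisfies the Kubeflow Pipelines metric-name pattern."""
    return METRIC_NAME_PATTERN.fullmatch(name) is not None

print(is_valid_metric_name('accuracy-score'))  # True
print(is_valid_metric_name('1-accuracy'))      # False: must start with a letter
print(is_valid_metric_name('a' * 100))         # False: at most 64 characters
```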
