From 9ad0b97ad07d1baab74cabb70be221955004f85b Mon Sep 17 00:00:00 2001
From: Dave Berenbaum
Date: Fri, 25 Aug 2023 10:10:54 -0400
Subject: [PATCH] drop metrics/plots stage outputs (#4798)

---
 content/docs/command-reference/plots/diff.md  |   3 -
 content/docs/command-reference/plots/index.md |   7 +-
 .../docs/command-reference/plots/modify.md    | 167 ----------------
 content/docs/command-reference/plots/show.md  |   9 +-
 content/docs/sidebar.json                     |   4 -
 .../visualizing-plots.md                      |  47 -----
 .../project-structure/dvcyaml-files.md        | 187 ++++++------------
 7 files changed, 66 insertions(+), 358 deletions(-)
 delete mode 100644 content/docs/command-reference/plots/modify.md

diff --git a/content/docs/command-reference/plots/diff.md b/content/docs/command-reference/plots/diff.md
index 5c99f3750c..5bc0023102 100644
--- a/content/docs/command-reference/plots/diff.md
+++ b/content/docs/command-reference/plots/diff.md
@@ -41,9 +41,6 @@ specified with the `--targets` option (any valid plots file is accepted). The
 plot style can be customized with [plot templates], using the `--template`
 option. See `dvc plots` to learn more about plots files and templates.
 
-> Note that the default behavior of this command can be modified per metrics
-> file with `dvc plots modify`.
-
 Another way to display plots is the `dvc plots show` command, which just lists
 all the current plots, without comparisons.
 
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index 6e2e436de9..30c581f083 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -2,14 +2,13 @@
 
 A set of commands to visualize and compare data series or images from ML
 projects: [show](/doc/command-reference/plots/show),
-[diff](/doc/command-reference/plots/diff),
-[modify](/doc/command-reference/plots/modify) and
+[diff](/doc/command-reference/plots/diff), and
 [templates](/doc/command-reference/plots/templates).
 
 ## Synopsis
 
 ```usage
-usage: dvc plots [-h] [-q | -v] {show,diff,modify,templates} ...
+usage: dvc plots [-h] [-q | -v] {show,diff,templates} ...
 
 positional arguments:
   COMMAND
@@ -17,8 +16,6 @@ positional arguments:
                         definitions in `dvc.yaml`.
     diff                Show multiple versions of a plot by overlaying them
                         in a single image.
-    modify              Modify display properties of data-series plots
-                        defined in stages (has no effect on image plots).
     templates           List built-in plots templates or show JSON
                         specification for one.
 ```
diff --git a/content/docs/command-reference/plots/modify.md b/content/docs/command-reference/plots/modify.md
deleted file mode 100644
index 1fe72071e6..0000000000
--- a/content/docs/command-reference/plots/modify.md
+++ /dev/null
@@ -1,167 +0,0 @@
-# plots modify
-
-Modify display properties of data-series [plots](/doc/command-reference/plots)
-defined in stages.
-
-> ⚠️ Note that this command can modify only data-series plots. It has no effect
-> on image-type plots or any [top-level plot] definitions.
-
-[top-level plot]: /doc/user-guide/project-structure/dvcyaml-files#plots
-
-## Synopsis
-
-```usage
-usage: dvc plots modify [-h] [-q | -v] [-t ] [-x ]
-                        [-y ] [--no-header] [--title ]
-                        [--x-label ] [--y-label ]
-                        [--unset [ [ ...]]]
-                        target
-
-positional arguments:
-  target                Plots file to set properties for
-                        (defined at the stage level)
-```
-
-## Description
-
-It might be not convenient for users or automation systems to specify all the
-_display properties_ (such as `y-label`, `template`, `title`, etc.) each time
-plots are generated with `dvc plots show` or `dvc plots diff`. This command sets
-(or unsets) default display properties for a specific plots file.
-
-The path to the plots file `target` is required. It must be listed in a
-`dvc.yaml` file (see the `--plots` option of `dvc stage add`).
-`dvc plots modify` adds the display properties to `dvc.yaml`.
-
-Property names are passed as [options](#options) to this command (prefixed with
-`--`). These are based on the [Vega-Lite](https://vega.github.io/vega-lite/)
-specification.
-
-Note that a secondary use of this command is to convert output or simple
-`dvc metrics` file into a plots file (see an
-[example](#example-convert-any-output-into-a-plot)).
-
-## Options
-
-- `-t , --template ` - set a default
-  [plot template](/doc/user-guide/experiment-management/visualizing-plots#plot-templates-data-series-only).
-
-- `-x ` - set a default field or column name (or number) from which the X
-  axis data comes from.
-
-- `-y ` - set a default field or column name (or number) from which the Y
-  axis data comes from.
-
-- `--x-label ` - set a default title for the X axis.
-
-- `--y-label ` - set a default title for the Y axis.
-
-- `--title ` - set a default plot title.
-
-- `--unset [ [ ...]]` - unset one or more display
-  properties. Use the property name(s) without `--` in the argument sent to this
-  option.
-
-- `--no-header` - lets DVC know that the `target` CSV or TSV does not have a
-  header. A 0-based numeric index can be used to identify each column instead of
-  names.
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
-  problems arise, otherwise 1.
-
-- `-v`, `--verbose` - displays detailed tracing information.
-
-## Examples
-
-The initial plot was showing the last column of CSV file by default which is
-_loss_ metrics while _accuracy_ is expected as Y axis:
-
-```
-epoch,accuracy,loss
-0,0.9403833150863647,0.2019129991531372
-1,0.9733833074569702,0.08973673731088638
-2,0.9815833568572998,0.06529958546161652
-3,0.9861999750137329,0.04984375461935997
-4,0.9882333278656006,0.041892342269420624
-```
-
-```cli
-$ dvc plots show logs.csv
-file:///Users/usr/src/myclassifier/logs.html
-```
-
-![](/img/plots_mod_loss.svg)
-
-Changing the y-axis to _accuracy_:
-
-```cli
-$ dvc plots modify logs.csv -y accuracy
-$ dvc plots show logs.csv
-file:///Users/usr/src/myclassifier/logs.html
-```
-
-![](/img/plots_mod_acc.svg)
-
-Note that a new field _y_ was added to `dvc.yaml` file for the plot. Make sure
-to commit the change in Git if the modification needs to be preserved.
-
-```yaml
-plots:
-  - logs.csv:
-      cache: false
-      y: accuracy
-```
-
-Changing the plot `title` and `x-label`:
-
-```cli
-$ dvc plots modify logs.csv --title Accuracy -x epoch --x-label Epoch
-$ dvc plots show logs.csv
-file:///Users/usr/src/myclassifier/logs.html
-```
-
-![](/img/plots_mod_acc_titles.svg)
-
-Two new fields were added to `dvc.yaml`: `x-label` and `title`:
-
-```yaml
-plots:
-  - plots.csv:
-      cache: false
-      y: accuracy
-      x_label: epoch
-      title: Accuracy
-```
-
-## Example: Template change
-
-Something like `dvc stage add --plots file.csv ...` assigns the default
-template, which needs to be changed in many cases. This command can do so:
-
-```cli
-$ dvc plots modify classes.csv --template confusion
-```
-
-## Example: Convert any output into a plot
-
-Let's take an example `evaluate` stage which has `logs.csv` as an output. We can
-use `dvc plots modify` to convert the `logs.csv` output file into a plots file,
-and then confirm the changes that happened in `dvc.yaml`:
-
-```cli
-$ dvc plots modify logs.csv
-```
-
-```git
- evaluate:
-   cmd: python src/evaluate.py
-   deps:
-   - src/evaluate.py
--  outs:
--  - logs.csv
-   plots:
-   - scores.json
-+  - logs.csv
-```
diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md
index defee7b828..099bb44dcf 100644
--- a/content/docs/command-reference/plots/show.md
+++ b/content/docs/command-reference/plots/show.md
@@ -30,13 +30,6 @@ All plots defined in `dvc.yaml` are used by default, but you can specify any
 The plot style can be customized with [plot templates], using the `--template`
 option. To learn more about plots file formats and templates, see `dvc plots`.
 
-
-
-The default behavior of this command can be modified per [stage plot] file with
-`dvc plots modify`.
-
-
-
 [certain data]:
   /doc/user-guide/experiment-management/visualizing-plots#supported-plot-file-formats
 [plot templates]:
@@ -205,7 +198,7 @@ $ dvc plots show --no-header logs.csv -y 2
 file:///Users/usr/src/dvc_plots/index.html
 ```
 
-## Example: Top-level plots
+## Example: `dvc.yaml` plots
 
 ### Simple plot definition
 
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 5cbd5b3928..fda47b7801 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -429,10 +429,6 @@
             "label": "plots diff",
             "slug": "diff"
           },
-          {
-            "label": "plots modify",
-            "slug": "modify"
-          },
           {
             "label": "plots templates",
             "slug": "templates"
diff --git a/content/docs/user-guide/experiment-management/visualizing-plots.md b/content/docs/user-guide/experiment-management/visualizing-plots.md
index c5a724bc8c..817e31a426 100644
--- a/content/docs/user-guide/experiment-management/visualizing-plots.md
+++ b/content/docs/user-guide/experiment-management/visualizing-plots.md
@@ -188,53 +188,6 @@ Refer to the [full format specification] and `dvc plots show` for more details.
 
-### Plot outputs
-
-Plots can use any file defined in the project, including outputs of
-[pipelines]:
-
-```yaml
-plots:
-  - logs.csv:
-      x: epoch
-      y: loss
-stages:
-  build:
-    cmd: python train.py
-    outs:
-      - logs.csv
-    ...
-```
-
-Alternatively, when defining [pipelines], some outputs (both files
-and directories) can be placed under a `plots` list for the corresponding stage
-in `dvc.yaml`. This will tell DVC that they are intended for visualization.
-
-
-
-When using `dvc stage add`, use `--plots/--plots-no-cache` instead of
-`--outs/--outs-no-cache`.
-
-
-
-```yaml
-stages:
-  build:
-    cmd: python train.py
-    plots:
-      - logs.csv:
-          x: epoch
-          y: loss
-    ...
-```
-
-Marking stage outputs as plots is convenient for working with plots at the stage
-level, without having to write top-level `plots` definitions in `dvc.yaml`.
-However, stage-level plots do not support custom plot IDs or multiple data
-sources.
-
-[pipelines]: /doc/start/data-management/data-pipelines
-
 ## Plot templates (data-series only)
 
 DVC uses [Vega-Lite](https://vega.github.io/vega-lite/) JSON specifications to
diff --git a/content/docs/user-guide/project-structure/dvcyaml-files.md b/content/docs/user-guide/project-structure/dvcyaml-files.md
index c833c65c9a..20880c399d 100644
--- a/content/docs/user-guide/project-structure/dvcyaml-files.md
+++ b/content/docs/user-guide/project-structure/dvcyaml-files.md
@@ -82,12 +82,6 @@ directory path (relative to the location of `dvc.yaml`) or an arbitrary string.
 If the ID is an arbitrary string, a file path must be provided in the `y` field
 (`x` file path is always optional and cannot be the only path provided).
 
-In addition to these "top-level plots," users can mark specific stage
-outputs as [plot outputs](#metrics-and-plots-outputs). DVC will
-collect both types and display everything conforming to each plot configuration.
-If any stage plot files or directories are also used in a top-level definition,
-DVC will create separate rendering for each type.
-
 
 Refer to [Visualizing Plots] and `dvc plots show` for more examples, and refer
@@ -99,75 +93,66 @@ to [DVCLive] for a helper to log plots.
 
 ### Available configuration fields
 
-- `y` - source for the Y axis data:
-
-  - **Top-level plots** (_string, list, dict_):
-
-    If plot ID is a path, one or more column/field names is expected. For
-    example:
-
-    ```yaml
-    plots:
-      - regression_hist.csv:
-          y: mean_squared_error
-      - classifier_hist.csv:
-          y: [acc, loss]
-    ```
-
-    If plot ID is an arbitrary string, a dictionary of file paths mapped to
-    column/field names is expected. For example:
-
-    ```yaml
-    plots:
-      - train_val_test:
-          y:
-            train.csv: [train_acc, val_acc]
-            test.csv: test_acc
-    ```
-
-  - **Plot outputs** (_string_): one column/field name.
-
-- `x` - source for the X axis data. An auto-generated _step_ field is used by
-  default.
-
-  - **Top-level plots** (_string, dict_):
-
-    If plot ID is a path, one column/field name is expected. For example:
-
-    ```yaml
-    plots:
-      - classifier_hist.csv:
-          y: [acc, loss]
-          x: epoch
-    ```
-
-    If plot ID is an arbitrary string, `x` may either be one column/field name,
-    or a dictionary of file paths each mapped to one column/field name (the
-    number of column/field names must match the number in `y`).
-
-    ```yaml
-    plots:
-      - train_val_test: # single x
-          y:
-            train.csv: [train_acc, val_acc]
-            test.csv: test_acc
-          x: epoch
-      - roc_vs_prc: # x dict
-          y:
-            precision_recall.json: precision
-            roc.json: tpr
-          x:
-            precision_recall.json: recall
-            roc.json: fpr
-      - confusion: # different x and y paths
-          y:
-            dir/preds.csv: predicted
-          x:
-            dir/actual.csv: actual
-          template: confusion
-    ```
-
-  - **Plot outputs** (_string_): one column/field name.
+- `y` (_string, list, dict_) - source for the Y axis data:
+
+  If plot ID is a path, one or more column/field names is expected. For example:
+
+  ```yaml
+  plots:
+    - regression_hist.csv:
+        y: mean_squared_error
+    - classifier_hist.csv:
+        y: [acc, loss]
+  ```
+
+  If plot ID is an arbitrary string, a dictionary of file paths mapped to
+  column/field names is expected. For example:
+
+  ```yaml
+  plots:
+    - train_val_test:
+        y:
+          train.csv: [train_acc, val_acc]
+          test.csv: test_acc
+  ```
+
+- `x` (_string, dict_) - source for the X axis data. An auto-generated _step_
+  field is used by default.
+
+  If plot ID is a path, one column/field name is expected. For example:
+
+  ```yaml
+  plots:
+    - classifier_hist.csv:
+        y: [acc, loss]
+        x: epoch
+  ```
+
+  If plot ID is an arbitrary string, `x` may either be one column/field name, or
+  a dictionary of file paths each mapped to one column/field name (the number of
+  column/field names must match the number in `y`).
+
+  ```yaml
+  plots:
+    - train_val_test: # single x
+        y:
+          train.csv: [train_acc, val_acc]
+          test.csv: test_acc
+        x: epoch
+    - roc_vs_prc: # x dict
+        y:
+          precision_recall.json: precision
+          roc.json: tpr
+        x:
+          precision_recall.json: recall
+          roc.json: fpr
+    - confusion: # different x and y paths
+        y:
+          dir/preds.csv: predicted
+        x:
+          dir/actual.csv: actual
+        template: confusion
+  ```
 
 - `y_label` (_string_) - Y axis label. If all `y` data sources have the same
   field name, that will be the default. Otherwise, it's "y".
@@ -175,10 +160,8 @@ to [DVCLive] for a helper to log plots.
 - `x_label` (_string_) - X axis label. If all `y` data sources have the same
   field name, that will be the default. Otherwise, it's "x".
 
-- `title` (_string_) - header for the plot(s). Defaults:
-
-  - **Top-level plots**: `path/to/dvc.yaml::plot_id`
-  - **Plot outputs**: `path/to/data.csv`
+- `title` (_string_) - header for the plot(s). Defaults to
+  `path/to/dvc.yaml::plot_id`.
 
 - `template` (_string_) - [plot template]. Defaults to `linear`.
 
@@ -235,7 +218,7 @@ them).
 
-Output files may be viable data sources for [top-level plots](#plots).
+Output files may be viable data sources for [plots](#plots).
 
@@ -352,48 +335,6 @@ See also `dvc params diff` to compare params across project version.
 
-### Metrics and Plots outputs
-
-
-
-Metrics and plots outputs described below come from earlier versions of DVC and
-remain as a convenience. You can instead define metrics and plots separate from
-your pipeline with [DVCLive] or add "top-level" [metrics](#metrics) and
-[plots](#plots). You can optionally include them as regular `outs` in the
-pipeline.
-
-
-
-Like common outputs, metrics and plots files are
-produced by the stage `cmd`. However, their purpose is different. Typically they
-contain metadata to evaluate pipeline processes. Example:
-
-```yaml
-stages:
-  build:
-    cmd: python train.py
-    deps:
-      - features.csv
-    outs:
-      - model.pt
-    metrics:
-      - accuracy.json:
-          cache: false
-    plots:
-      - auc.json:
-          cache: false
-```
-
-
-
-`cache: false` is typical here, since they're small enough for Git to store
-directly.
-
-
-
-The commands in `dvc metrics` and `dvc plots` help you display and compare
-metrics and plots.
-
 ## Stage entries
 
 These are the fields that are accepted in each stage:
@@ -405,8 +346,6 @@ These are the fields that are accepted in each stage:
 | `deps` | List of dependency paths (relative to `wdir`). |
 | `outs` | List of output paths (relative to `wdir`). These can contain certain optional [subfields](#output-subfields). |
 | `params` | List of parameter dependency keys (field names) to track from `params.yaml` (in `wdir`). The list may also contain other parameters file names, with a sub-list of the param names to track in them. |
-| `metrics` | List of [metrics files](/doc/command-reference/metrics), and optionally, whether or not this metrics file is cached (`true` by default). See the `--metrics-no-cache` (`-M`) option of `dvc stage add`. |
-| `plots` | List of [plot metrics](/doc/command-reference/plots), and optionally, their default configuration (subfields matching the options of `dvc plots modify`), and whether or not this plots file is cached ( `true` by default). See the `--plots-no-cache` option of `dvc stage add`. |
 | `frozen` | Whether or not this stage is frozen (prevented from execution during reproduction) |
 | `always_changed` | Causes this stage to be always considered as [changed] by commands such as `dvc status` and `dvc repro`. `false` by default |
 | `meta` | (Optional) arbitrary metadata can be added manually with this field. Any YAML content is supported. `meta` contents are ignored by DVC, but they can be meaningful for user processes that read or write `.dvc` files directly. |