add test metrics in ci capability (#5774)
This PR documents the new capability to test metrics in a CI job and explains how to enable it.

[docs project](https://www.notion.so/dbtlabs/Metrics-in-CI-c88adcf7e5d5478b89b6700484be856a)

[prd](https://www.notion.so/dbtlabs/Adding-metrics-to-CI-querying-metrics-in-the-IDE-1e48b9cfabd2401797eaa71cd72bf2e4?pvs=4)

[ ] Needs PM review @Jstein77
mirnawong1 authored Jul 16, 2024
1 parent a5fc8dc commit d1cf3e3
Showing 16 changed files with 122 additions and 11 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ pagination_next: null

- 🗺️ Use these best practices to map out your team's plan to **incrementally adopt the Semantic Layer**.
- 🤗 Get involved in the community and ask questions, **help craft best practices**, and share your progress in building a dbt Semantic Layer.
- [Validate semantic nodes in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci) to ensure code changes made to dbt models don't break these metrics.

The dbt Semantic Layer is the biggest paradigm shift thus far in the young practice of analytics engineering. It's ready to provide value right away, but is most impactful if you move your project towards increasing normalization, and allow MetricFlow to do the denormalization for you with maximum dimensionality.

2 changes: 1 addition & 1 deletion website/docs/docs/build/saved-queries.md
Expand Up @@ -230,5 +230,5 @@ To include all saved queries in the dbt build run, use the [`--resource-type` fl
</detailsToggle>

## Related docs

- [Validate semantic nodes in a CI job](/docs/deploy/ci-jobs#semantic-validations-in-ci)
- Configure [caching](/docs/use-dbt-semantic-layer/sl-cache)
3 changes: 3 additions & 0 deletions website/docs/docs/dbt-versions/release-notes.md
Expand Up @@ -18,6 +18,9 @@ Release notes are grouped by month for both multi-tenant and virtual private clo

[^*] The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.

## July 2024
- **New:** Introduced Semantic validations in CI pipelines. Automatically test your semantic nodes (metrics, semantic models, and saved queries) during code reviews by adding warehouse validation checks in your CI job using the `dbt sl validate` command. You can also validate modified semantic nodes to guarantee code changes made to dbt models don't break these metrics. Refer to [Semantic validations in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci) to learn about the additional commands and use cases.

## June 2024
- **New:** Introduced new granularity support for cumulative metrics in MetricFlow. Granularity options for cumulative metrics are slightly different than granularity for other metric types. For other metrics, we use the `date_trunc` function to implement granularity. However, because cumulative metrics are non-additive (values can't be added up), we can't use the `date_trunc` function to change their time grain granularity.

87 changes: 86 additions & 1 deletion website/docs/docs/deploy/ci-jobs.md
Expand Up @@ -10,7 +10,6 @@ You can set up [continuous integration](/docs/deploy/continuous-integration) (CI

dbt Labs recommends that you create your CI job in a dedicated dbt Cloud [deployment environment](/docs/deploy/deploy-environments#create-a-deployment-environment) that's connected to a staging database. Having a separate environment dedicated for CI will provide better isolation between your temporary CI schema builds and your production data builds. Additionally, sometimes teams need their CI jobs to be triggered when a PR is made to a branch other than main. If your team maintains a staging branch as part of your release process, having a separate environment will allow you to set a [custom branch](/faqs/environments/custom-branch-settings) and, accordingly, the CI job in that dedicated environment will be triggered only when PRs are made to the specified custom branch. To learn more, refer to [Get started with CI tests](/guides/set-up-ci).


### Prerequisites
- You have a dbt Cloud account.
- For the [Concurrent CI checks](/docs/deploy/continuous-integration#concurrent-ci-checks) and [Smart cancellation of stale builds](/docs/deploy/continuous-integration#smart-cancellation) features, your dbt Cloud account must be on the [Team or Enterprise plan](https://www.getdbt.com/pricing/).
Expand Down Expand Up @@ -77,6 +76,92 @@ If you're not using dbt Cloud’s native Git integration with [GitHub](/docs/cl
- `non_native_pull_request_id` (for example, BitBucket)
- Provide the `git_sha` or `git_branch` to target the correct commit or branch to run the job against.
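
As a rough illustration of the fields above, the following sketch assembles a request body for the dbt Cloud API endpoint that triggers a job run. The PR number and commit SHA are placeholder values, the `build_ci_payload` helper is hypothetical, and the endpoint URL in the commented-out `curl` call is an assumption, so check the dbt Cloud API reference before relying on it.

```bash
#!/usr/bin/env bash
# Sketch only: all IDs below are placeholders, not real values.

# Hypothetical helper: print a JSON body for a CI run triggered from a
# Git provider without a native dbt Cloud integration.
#   $1 = pull request ID (numeric), $2 = commit SHA to run the job against
build_ci_payload() {
  printf '{"cause": "CI run for PR %s", "non_native_pull_request_id": %s, "git_sha": "%s"}\n' \
    "$1" "$1" "$2"
}

payload="$(build_ci_payload 42 abc1234)"
echo "$payload"

# To send it (assumed endpoint; replace the placeholders and set DBT_CLOUD_TOKEN):
# curl -X POST "https://cloud.getdbt.com/api/v2/accounts/<account_id>/jobs/<job_id>/run/" \
#   -H "Authorization: Token $DBT_CLOUD_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$payload"
```

The `curl` call is left commented out so the sketch can be run safely without sending a request.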

## Semantic validations in CI <Lifecycle status="team,enterprise" />

Automatically test your semantic nodes (metrics, semantic models, and saved queries) during code reviews by adding warehouse validation checks in your CI job, guaranteeing that any code changes made to dbt models don't break these metrics.

To do this, add the command `dbt sl validate --select state:modified+` in the CI job. This ensures the validation of modified semantic nodes and their downstream dependencies.

- Testing semantic nodes in a CI job supports deferral and selection of semantic nodes.
- It allows you to catch issues early in the development process and deliver high-quality data to your end users.
- Semantic validation executes an explain query in the data warehouse for semantic nodes to ensure the generated SQL will execute.
- For semantic nodes and models that aren't downstream of modified models, dbt Cloud defers to the production models.

To learn how to set this up, refer to the following steps:

1. Navigate to the **Job settings** page and click **Edit**.
2. Add the `dbt sl validate --select state:modified+` command under **Commands** in the **Execution settings** section. The command uses state selection and deferral to run validation on any semantic nodes downstream of model changes. To reduce job times, we recommend only running CI on modified semantic models.
3. Click **Save** to save your changes.

There are additional commands and use cases described in the [next section](#use-cases), such as validating all semantic nodes, validating specific semantic nodes, and so on.

<Lightbox src="/img/docs/dbt-cloud/deployment/ci-dbt-sl-validate-downstream.jpg" width="90%" title="Validate semantic nodes downstream of model changes in your CI job." />

### Use cases

Use or combine different selectors or commands to validate semantic nodes in your CI job. Semantic validations in CI supports the following use cases:

<Expandable alt_header="Semantic nodes downstream of model changes (recommended)" >

To validate semantic nodes that are downstream of a model change, add the following two commands in your job's **Execution settings** section:

```bash
dbt build --select state:modified+
dbt sl validate --select state:modified+
```

- The first command builds the modified models.
- The second command validates the semantic nodes downstream of the modified models.

Before running semantic validations, dbt Cloud must build the modified models. This process ensures that downstream semantic nodes are validated using the CI schema through the dbt Semantic Layer API.

For semantic nodes and models that aren't downstream of modified models, dbt Cloud defers to the production models.

<Lightbox src="/img/docs/dbt-cloud/deployment/ci-dbt-sl-validate-downstream.jpg" width="90%" title="Validate semantic nodes downstream of model changes in your CI job." />

</Expandable>

<Expandable alt_header="Semantic nodes that are modified or affected by downstream modified nodes">

To only validate modified semantic nodes, use the following command (with [state selection](/reference/node-selection/syntax#stateful-selection)):

```bash
dbt sl validate --select state:modified+
```

<Lightbox src="/img/docs/dbt-cloud/deployment/ci-dbt-sl-validate-modified.jpg" width="90%" title="Use state selection to validate modified metric definition models in your CI job." />

This will only validate semantic nodes. It uses the defer state configured in your orchestration job, deferring to your production models.

</Expandable>

<Expandable alt_header="Select specific semantic nodes">

Use the selector syntax to select the _specific_ semantic node(s) you want to validate:

```bash
dbt sl validate --select metric:revenue
```

<Lightbox src="/img/docs/dbt-cloud/deployment/ci-dbt-sl-validate-select.jpg" width="90%" title="Use the selector syntax to validate specific semantic nodes in your CI job." />

In this example, the CI job will validate the selected `metric:revenue` semantic node. To select multiple semantic nodes, use the selector syntax: `dbt sl validate --select metric:revenue metric:customers`.

If you don't specify a selector, dbt Cloud will validate all semantic nodes in your project.
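
If you generate the command from a list of metric names, a small helper can assemble the selector string. The following is a minimal sketch; `build_validate_cmd` is a hypothetical function (not part of dbt) that only prints the command and doesn't run it.

```bash
# Hypothetical helper: turn metric names into a `dbt sl validate` command line.
# With no arguments, it prints the bare command, which validates all semantic nodes.
build_validate_cmd() {
  local cmd="dbt sl validate"
  local name
  if [ "$#" -gt 0 ]; then
    cmd="$cmd --select"
    for name in "$@"; do
      cmd="$cmd metric:$name"
    done
  fi
  printf '%s\n' "$cmd"
}

build_validate_cmd revenue customers   # prints: dbt sl validate --select metric:revenue metric:customers
build_validate_cmd                     # prints: dbt sl validate
```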

</Expandable>

<Expandable alt_header="Select all semantic nodes">

To validate _all_ semantic nodes in your project, add the following command to defer to your production schema when generating the warehouse validation queries:

```bash
dbt sl validate
```

<Lightbox src="/img/docs/dbt-cloud/deployment/ci-dbt-sl-validate-all.jpg" width="90%" title="Validate all semantic nodes in your CI job by adding the command: 'dbt sl validate' in your job execution settings." />

</Expandable>

## Troubleshooting

4 changes: 2 additions & 2 deletions website/docs/docs/deploy/continuous-integration.md
Expand Up @@ -16,9 +16,9 @@ Using CI helps:

## How CI works

When you [set up CI jobs](/docs/deploy/ci-jobs#set-up-ci-jobs), dbt Cloud listens for notification from your Git provider indicating that a new PR has been opened or updated with new commits. When dbt Cloud receives one of these notifications, it enqueues a new run of the CI job.
When you [set up CI jobs](/docs/deploy/ci-jobs#set-up-ci-jobs), dbt Cloud listens for notifications from your Git provider indicating that a new PR has been opened or updated with new commits. When dbt Cloud receives one of these notifications, it enqueues a new run of the CI job.

dbt Cloud builds and tests the models affected by the code change in a temporary schema, unique to the PR. This process ensures that the code builds without error and that it matches the expectations as defined by the project's dbt tests. The unique schema name follows the naming convention `dbt_cloud_pr_<job_id>_<pr_id>` (for example, `dbt_cloud_pr_1862_1704`) and can be found in the run details for the given run, as shown in the following image:
dbt Cloud builds and tests models, semantic models, metrics, and saved queries affected by the code change in a temporary schema, unique to the PR. This process ensures that the code builds without error and that it matches the expectations as defined by the project's dbt tests. The unique schema name follows the naming convention `dbt_cloud_pr_<job_id>_<pr_id>` (for example, `dbt_cloud_pr_1862_1704`) and can be found in the run details for the given run, as shown in the following image:

<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/using_ci_dbt_cloud.png" width="90%" title="Viewing the temporary schema name for a run triggered by a PR"/>
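
If you need the schema name in a script (for example, to inspect or clean up a PR schema), the naming convention can be reproduced as follows. This is a minimal sketch; `ci_schema_name` is a hypothetical helper, and the IDs are the example values used above.

```bash
# Hypothetical helper: compose the temporary CI schema name
# from a job ID ($1) and a PR ID ($2), following dbt_cloud_pr_<job_id>_<pr_id>.
ci_schema_name() {
  printf 'dbt_cloud_pr_%s_%s\n' "$1" "$2"
}

ci_schema_name 1862 1704   # prints: dbt_cloud_pr_1862_1704
```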

4 changes: 3 additions & 1 deletion website/docs/docs/deploy/deploy-environments.md
Expand Up @@ -39,7 +39,9 @@ In dbt Cloud, each project can have one designated deployment environment, which

### Semantic Layer

For Semantic Layer-eligible customers, the next section of environment settings is the Semantic Layer configurations. [The Semantic Layer setup guide](/docs/use-dbt-semantic-layer/setup-sl) has the most up-to-date setup instructions!
For customers using the dbt Semantic Layer, the next section of environment settings is the Semantic Layer configurations. [The Semantic Layer setup guide](/docs/use-dbt-semantic-layer/setup-sl) has the most up-to-date setup instructions.

You can also use the dbt Cloud job scheduler to [validate your semantic nodes in a CI job](/docs/deploy/ci-jobs#semantic-validations-in-ci) to ensure code changes made to dbt models don't break these metrics.

## Staging environment

6 changes: 2 additions & 4 deletions website/docs/docs/deploy/job-commands.md
Expand Up @@ -28,7 +28,6 @@ Every job invocation automatically includes the [`dbt deps`](/reference/commands

**Job outcome** &mdash; During a job run, the built-in commands are "chained" together. This means if one of the run steps in the chain fails, then the next commands aren't executed, and the entire job fails with an "Error" job status.


<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/fail-dbtdeps.jpg" width="85%" title="A failed job that had an error during the dbt deps run step."/>

### Checkbox commands
Expand All @@ -49,9 +48,8 @@ You can add or remove as many dbt commands as necessary for every job. However,
Use [selectors](/reference/node-selection/syntax) as a powerful way to select and execute portions of your project in a job run. For example, to run tests for one_specific_model, use the selector: `dbt test --select one_specific_model`. The job will still run if a selector doesn't match any models.

:::


**Job outcome** &mdash; During a job run, the commands are "chained" together and executed as run steps. If one of the run steps in the chain fails, then the subsequent steps aren't executed, and the job will fail.

**Job outcome** &mdash; During a job run, the commands are "chained" together and executed as run steps. If one of the run steps in the chain fails, then the subsequent steps aren't executed, and the job will fail.
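
The chaining behavior can be approximated locally with a short shell sketch. This illustrates the stop-on-first-failure logic only, not how dbt Cloud implements it; `run_steps` is a hypothetical helper, and `true` and `false` stand in for real dbt commands.

```bash
# Sketch of "chained" run steps: run each command in order and
# stop at the first failure, skipping the remaining steps.
run_steps() {
  local step
  for step in "$@"; do
    echo "Running: $step"
    if ! sh -c "$step"; then
      echo "Step failed: $step"
      return 1
    fi
  done
  echo "All steps succeeded"
}

# Stand-in commands are used instead of real dbt invocations:
# the second step fails, so the third step never runs.
run_steps "true" "false" "true" || echo "Job status: Error"
```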

In the following example image, the first four run steps are successful. However, if the fifth run step (`dbt run --select state:modified+ --full-refresh --fail-fast`) fails, then the next run steps aren't executed, and the entire job fails. The failed job returns a non-zero [exit code](/reference/exit-codes) and "Error" job status:

1 change: 1 addition & 0 deletions website/docs/docs/use-dbt-semantic-layer/exports.md
Expand Up @@ -203,5 +203,6 @@ To include all saved queries in the dbt build run, use the [`--resource-type` fl
</detailsToggle>

## Related docs
- [Validate semantic nodes in a CI job](/docs/deploy/ci-jobs#semantic-validations-in-ci)
- Configure [caching](/docs/use-dbt-semantic-layer/sl-cache)
- [dbt Semantic Layer FAQs](/docs/use-dbt-semantic-layer/sl-faqs)
8 changes: 7 additions & 1 deletion website/docs/docs/use-dbt-semantic-layer/setup-sl.md
Expand Up @@ -43,10 +43,16 @@ import SlSetUp from '/snippets/_new-sl-setup.md';
8. You’re done 🎉! The semantic layer is now enabled for your project.
-->

## Next steps

- Now that you've set up the dbt Semantic Layer, start querying your metrics with the [available integrations](/docs/cloud-integrations/avail-sl-integrations).
- [Optimize querying performance](/docs/use-dbt-semantic-layer/sl-cache) using declarative caching.
- [Validate semantic nodes in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci) to ensure code changes made to dbt models don't break these metrics.
- If you haven't already, learn how to [build your metrics and semantic models](/docs/build/build-metrics-intro) in your development tool of choice.

## Related docs

- [Build your metrics](/docs/build/build-metrics-intro)
- [Available integrations](/docs/cloud-integrations/avail-sl-integrations)
- [Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview)
- [Get started with the dbt Semantic Layer](/guides/sl-snowflake-qs)
- [dbt Semantic Layer FAQs](/docs/use-dbt-semantic-layer/sl-faqs)
2 changes: 2 additions & 0 deletions website/docs/docs/use-dbt-semantic-layer/sl-cache.md
Expand Up @@ -132,6 +132,8 @@ If an upstream model has data in it that was created after the cache was created

You can manually invalidate the cache through the [dbt Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview) using the `InvalidateCacheResult` field.


## Related docs
- [Validate semantic nodes in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci)
- [Saved queries](/docs/build/saved-queries)
- [dbt Semantic Layer FAQs](/docs/use-dbt-semantic-layer/sl-faqs)
9 changes: 9 additions & 0 deletions website/docs/docs/use-dbt-semantic-layer/sl-faqs.md
Expand Up @@ -226,6 +226,15 @@ Yes, we approach this by specifying a [dimension](/docs/build/dimensions) that a
Yes, while [entities](/docs/build/entities) must be defined under “entities,” they can be queried like dimensions in downstream tools. Additionally, if the entity isn't used to perform joins across your semantic models, you may optionally define it as a dimension.
</Expandable>

<Expandable alt_header="Can I test my semantic models and metrics?">

Yes! You can validate your semantic nodes (semantic models, metrics, saved queries) in a few ways:

- [Query and validate your metrics](/docs/build/metricflow-commands) in your development tool before submitting your code changes.
- [Validate semantic nodes in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci) to ensure code changes made to dbt models don't break these metrics.

</Expandable>

## Available integrations

<Expandable alt_header="What integrations are supported today?">
6 changes: 5 additions & 1 deletion website/docs/guides/set-up-ci.md
Expand Up @@ -54,6 +54,10 @@ In the Execution Settings, your command will be preset to `dbt build --select st

To be able to find modified nodes, dbt needs to have something to compare against. dbt Cloud uses the last successful run of any job in your Production environment as its [comparison state](/reference/node-selection/syntax#about-node-selection). As long as you identified your Production environment in Step 2, you won't need to touch this. If you didn't, pick the right environment from the dropdown.

:::info Use CI to test your metrics
If you've [built semantic nodes](/docs/build/build-metrics-intro) in your dbt project, you can [validate them in a CI job](/docs/deploy/ci-jobs#semantic-validations-in-ci) to ensure code changes made to dbt models don't break these metrics.
:::

### 3. Test your process

That's it! There are other steps you can take to be even more confident in your work, such as validating your structure follows best practices and linting your code. For more information, refer to [Get started with Continuous Integration tests](/guides/set-up-ci).
Expand Down Expand Up @@ -356,4 +360,4 @@ When the Release Manager is ready to cut a new release, they will manually open

To test your new flow, create a new branch in the dbt Cloud IDE then add a new file or modify an existing one. Commit it, then create a new Pull Request (not a draft) against your `qa` branch. You'll see the integration tests begin to run. Once they complete, manually create a PR against `main`, and within a few seconds you’ll see the tests run again but this time incorporating all changes from all code that hasn't been merged to main yet.

</div>
</div>
0 comments on commit d1cf3e3