
Merge branch 'current' into manifest-v17
matthewshaver authored Oct 13, 2023
2 parents cfc6b4f + 6cccc16 commit 2ea693c
Showing 18 changed files with 251 additions and 27 deletions.
4 changes: 4 additions & 0 deletions website/dbt-versions.js
@@ -27,6 +27,10 @@ exports.versions = [
]

exports.versionedPages = [
{
"page": "reference/resource-configs/store_failures_as",
"firstVersion": "1.7",
},
{
"page": "docs/build/build-metrics-intro",
"firstVersion": "1.6",
7 changes: 7 additions & 0 deletions website/docs/docs/build/cumulative-metrics.md
Expand Up @@ -36,6 +36,13 @@ metrics:

```

## Limitations
Cumulative metrics are currently under active development and have the following limitations:

1. You can only use the [`metric_time` dimension](/docs/build/dimensions#time) when querying cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than `metric_time` in the query.
2. If you use `metric_time` in your query filter but don't include `start_time` and `end_time`, cumulative metrics will left-censor the input data. For example, if you query a cumulative metric with a 7-day window with the filter `{{ TimeDimension('metric_time') }} BETWEEN '2023-08-15' AND '2023-08-30'`, the values for `2023-08-15` to `2023-08-20` will return missing or incomplete data. This is because the `metric_time` filter is applied to the aggregation input. To avoid this, you must use `start_time` and `end_time` in the query filter, as sketched below.
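
The problematic pattern from limitation 2 can be sketched with the JDBC-style query syntax that appears later in this commit. This is a sketch only: the metric name `orders_last_7_days` is hypothetical, and the `BETWEEN` filter shown is the one the limitation warns about.

```sql
-- A hypothetical 7-day-window cumulative metric, filtered on metric_time.
-- Because the filter is applied to the aggregation input, the values returned
-- for 2023-08-15 through 2023-08-20 are left-censored (incomplete).
select * from {{
semantic_layer.query(metrics=['orders_last_7_days'],
group_by=[Dimension('metric_time').grain('day')],
where="{{ TimeDimension('metric_time', 'DAY') }} BETWEEN '2023-08-15' AND '2023-08-30'")
}}
```

Supplying `start_time` and `end_time` in the query filter instead of the `BETWEEN` clause avoids the left-censoring.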


## Cumulative metrics example


2 changes: 1 addition & 1 deletion website/docs/docs/build/dimensions.md
Expand Up @@ -102,7 +102,7 @@ dimensions:
To use BigQuery as your data platform, time dimension columns need to be in the datetime data type. If they are stored in another type, you can cast them to datetime using the `expr` property, as sketched below. Time dimensions are used to group metrics by different levels of time, such as day, week, month, quarter, and year. MetricFlow supports these granularities, which can be specified using the `time_granularity` parameter.
:::
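
As a minimal sketch of that cast (assuming a hypothetical timestamp column named `created_at`; not an example from the original page):

```yaml
dimensions:
  - name: created_at
    type: time
    expr: cast(created_at as datetime) # cast to datetime for BigQuery
    type_params:
      time_granularity: day
```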

Time has additional parameters specified under the `type_params` section. When you query one or more metrics in MetricFlow using the CLI, the default time dimension for a single metric is the primary time dimension, which you can refer to as `metric_time` or use the dimension's name.
Time has additional parameters specified under the `type_params` section. When you query one or more metrics in MetricFlow using the CLI, the default time dimension for a single metric is the aggregation time dimension, which you can refer to as `metric_time` or use the dimension's name.

You can use multiple time groups in separate metrics. For example, the `users_created` metric uses `created_at`, and the `users_deleted` metric uses `deleted_at`:
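
The YAML example that followed this sentence is folded out of the diff view. As an illustrative sketch instead (borrowing the JDBC-style query syntax documented elsewhere in this commit), both metrics remain queryable together through `metric_time`, even though they aggregate over different underlying columns:

```sql
select * from {{
semantic_layer.query(metrics=['users_created', 'users_deleted'],
group_by=[Dimension('metric_time').grain('day')])
}}
```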

67 changes: 67 additions & 0 deletions website/docs/docs/build/metricflow-time-spine.md
Expand Up @@ -12,6 +12,8 @@ To create this table, you need to create a model in your dbt project called `met

<File name='metricflow_time_spine.sql'>

<VersionBlock lastVersion="1.6">

```sql
{{
config(
@@ -38,8 +40,44 @@ final as (

select * from final
```

</VersionBlock>

<VersionBlock firstVersion="1.7">

```sql
{{
config(
materialized = 'table',
)
}}

with days as (

{{
dbt.date_spine(
'day',
"to_date('01/01/2000','mm/dd/yyyy')",
"to_date('01/01/2027','mm/dd/yyyy')"
)
}}

),

final as (
select cast(date_day as date) as date_day
from days
)

select * from final
```

</VersionBlock>

</File>

<VersionBlock lastVersion="1.6">

```sql
-- filename: metricflow_time_spine.sql
-- BigQuery supports DATE() instead of TO_DATE(). Use this model if you're using BigQuery
@@ -61,4 +99,33 @@ final as (
select *
from final
```

</VersionBlock>

<VersionBlock firstVersion="1.7">

```sql
-- filename: metricflow_time_spine.sql
-- BigQuery supports DATE() instead of TO_DATE(). Use this model if you're using BigQuery
{{config(materialized='table')}}
with days as (
{{dbt.date_spine(
'day',
"DATE(2000,01,01)",
"DATE(2030,01,01)"
)
}}
),

final as (
select cast(date_day as date) as date_day
from days
)

select *
from final
```

</VersionBlock>

You only need to include the `date_day` column in the table. MetricFlow can handle broader levels of detail, but it doesn't currently support finer grains.
2 changes: 1 addition & 1 deletion website/docs/docs/build/tests.md
Expand Up @@ -241,7 +241,7 @@ where {{ column_name }} is null

## Storing test failures

Normally, a test query will calculate failures as part of its execution. If you set the optional `--store-failures` flag or [`store_failures` config](/reference/resource-configs/store_failures), dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures.
Normally, a test query will calculate failures as part of its execution. If you set the optional `--store-failures` flag or the [`store_failures`](/reference/resource-configs/store_failures) or [`store_failures_as`](/reference/resource-configs/store_failures_as) configs, dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures.

This workflow allows you to query and examine failing records much more quickly in development:
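
The example that followed is folded out of the diff view. As a hypothetical sketch (the schema and test names below are invented for illustration; by default dbt writes stored failures to a schema suffixed with `dbt_test__audit`):

```sql
-- Inspect the rows that failed a `unique` test on orders.order_id
select *
from dev_alice_dbt_test__audit.unique_orders_order_id
```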

2 changes: 1 addition & 1 deletion website/docs/docs/core/pip-install.md
Expand Up @@ -5,7 +5,7 @@ description: "You can use pip to install dbt Core and adapter plugins from the c

You need to use `pip` to install dbt Core on Windows or Linux operating systems. You can use `pip` or [Homebrew](/docs/core/homebrew-install) for installing dbt Core on macOS.

You can install dbt Core and plugins using `pip` because they are Python modules distributed on [PyPI](https://pypi.org/project/dbt/).
You can install dbt Core and plugins using `pip` because they are Python modules distributed on [PyPI](https://pypi.org/project/dbt-core/).

<FAQ path="Core/install-pip-os-prereqs" />
<FAQ path="Core/install-python-compatibility" />
33 changes: 20 additions & 13 deletions website/docs/docs/dbt-cloud-apis/sl-jdbc.md
Expand Up @@ -5,7 +5,6 @@ description: "Integrate and use the JDBC API to query your metrics."
tags: [Semantic Layer, API]
---


<VersionBlock lastVersion="1.5">

import LegacyInfo from '/snippets/_legacy-sl-callout.md';
@@ -59,11 +58,13 @@ jdbc:arrow-flight-sql://semantic-layer.cloud.getdbt.com:443?&environmentId=20233

## Querying the API for metric metadata

The Semantic Layer JDBC API has built-in metadata calls which can provide a user with information about their metrics and dimensions. Here are some metadata commands and examples:
The Semantic Layer JDBC API has built-in metadata calls which can provide a user with information about their metrics and dimensions.

Refer to the following tabs for metadata commands and examples:

<Tabs>

<TabItem value="allmetrics" label="Fetch all defined metrics">
<TabItem value="allmetrics" label="Fetch defined metrics">

Use this query to fetch all defined metrics in your dbt project:

@@ -74,7 +75,7 @@ select * from {{
```
</TabItem>
<TabItem value="alldimensions" label="Fetch all dimensions for a metric">
<TabItem value="alldimensions" label="Fetch dimensions for a metric">
Use this query to fetch all dimensions for a metric.
@@ -87,7 +88,7 @@ select * from {{
</TabItem>
<TabItem value="dimensionvalueformetrics" label="Fetch dimension values metrics">
<TabItem value="dimensionvalueformetrics" label="Fetch dimension values">
Use this query to fetch dimension values for one or multiple metrics and single dimension.
@@ -100,7 +101,7 @@ semantic_layer.dimension_values(metrics=['food_order_amount'], group_by=['custom
</TabItem>
<TabItem value="queryablegranularitiesformetrics" label="Fetch queryable primary time granularities for metrics">
<TabItem value="queryablegranularitiesformetrics" label="Fetch queryable granularities for metrics">
Use this query to fetch queryable granularities for a list of metrics. This API request only returns the time granularities that make sense for the primary time dimension of the metrics (such as `metric_time`). If you want queryable granularities for other time dimensions, use the `dimensions()` call and find the `queryable_granularities` column.
@@ -113,6 +114,9 @@ select * from {{
</TabItem>
</Tabs>
<Tabs>
<TabItem value="metricsfordimensions" label="Fetch available metrics given dimensions">
@@ -144,9 +148,10 @@ select NAME, QUERYABLE_GRANULARITIES from {{
</TabItem>
<TabItem value="fetchprimarytimedimensionnames" label="Determine what time dimension(s) make up metric_time for your metric(s)">
<TabItem value="fetchprimarytimedimensionnames" label="Fetch primary time dimension names">
It may be useful in your application to expose the names of the time dimensions that represent `metric_time` or the common thread across all metrics.
You can first use the `metrics()` call to fetch a list of measures, then use the `measures()` call, which returns the name(s) of the time dimensions that make up metric time.
```bash
@@ -167,12 +172,13 @@ To query metric values, the following parameters are available (a combined example follows the table):
| `metrics` | The metric name as defined in your dbt metric configuration | `metrics=['revenue']` | Required |
| `group_by` | Dimension names or entities to group by. We require a reference to the entity of the dimension (other than for the primary time dimension), which is pre-appended to the front of the dimension name with a double underscore. | `group_by=['user__country', 'metric_time']` | Optional |
| `grain` | A parameter specific to any time dimension and changes the grain of the data from the default for the metric. | `group_by=[Dimension('metric_time')` <br/> `grain('week\|day\|month\|quarter\|year')]` | Optional |
| `where` | A where clause that allows you to filter on dimensions and entities using parameters - comes with `TimeDimension`, `Dimension`, and `Entity` objects. Granularity is required with `TimeDimension` | `where="{{ Dimension('customer__country') }} = 'US'"` | Optional |
| `where` | A where clause that allows you to filter on dimensions and entities using parameters. This takes a filter list or string. Inputs come with `Dimension` and `Entity` objects. Granularity is required if the `Dimension` is a time dimension | `where="{{ Dimension('customer__country') }} = 'US'"` | Optional |
| `limit` | Limit the data returned | `limit=10` | Optional |
|`order` | Order the data returned | `order_by=['-order_gross_profit']` (remove `-` for ascending order) | Optional |
| `compile` | If true, returns generated SQL for the data platform but does not execute | `compile=True` | Optional |
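
As a sketch combining several of the parameters above, using the example metric and dimension names from this page:

```sql
select * from {{
semantic_layer.query(metrics=['food_order_amount'],
group_by=[Dimension('metric_time').grain('week'), 'customer__customer_type'],
limit=10,
order_by=['-food_order_amount'])
}}
```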
## Note on time dimensions and `metric_time`
You will notice that in the list of dimensions for all metrics, there is a dimension called `metric_time`. `metric_time` is a reserved keyword for the measure-specific aggregation time dimensions. For any time-series metric, the `metric_time` keyword should always be available for use in queries. This is a common dimension across *all* metrics in a semantic graph.
@@ -246,13 +252,13 @@ select * from {{
Where filters in the API allow for a filter list or string. We recommend using the filter list for production applications as this format will realize all benefits from the <Term id="predicate-pushdown" /> where possible.
Where filters have the following components that you can use:
Where Filters have a few objects that you can use:
- `Dimension()` - This is used for any categorical or time dimensions. If used for a time dimension, granularity is required - `Dimension('metric_time').grain('week')` or `Dimension('customer__country')`
- `TimeDimension()` - This is used for all time dimensions and requires a granularity argument - `TimeDimension('metric_time', 'MONTH')`
- `Entity()` - Used for entities like primary and foreign keys - `Entity('order_id')`
- `Entity()` - This is used for entities like primary and foreign keys - `Entity('order_id')`
Note: If you prefer a more explicit path to create the `where` clause, you can optionally use the `TimeDimension` feature. This helps separate out categorical dimensions from time-related ones. The `TimeDimension` input takes the time dimension name and also requires granularity, like this: `TimeDimension('metric_time', 'MONTH')`.
Use the following example to query using a `where` filter with the string format:
@@ -261,7 +267,7 @@ Use the following example to query using a `where` filter with the string format
select * from {{
semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'],
group_by=[Dimension('metric_time').grain('month'),'customer__customer_type'],
where="{{ TimeDimension('metric_time', 'MONTH') }} >= '2017-03-09' AND {{ Dimension('customer__customer_type' }} in ('new') AND {{ Entity('order_id') }} = 10")
where="{{ Dimension('metric_time').grain('month') }} >= '2017-03-09' AND {{ Dimension('customer__customer_type' }} in ('new') AND {{ Entity('order_id') }} = 10")
}}
```
@@ -271,7 +277,7 @@ Use the following example to query using a `where` filter with a filter list for
select * from {{
semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'],
group_by=[Dimension('metric_time').grain('month'),'customer__customer_type'],
where=[{{ TimeDimension('metric_time', 'MONTH')}} >= '2017-03-09', {{ Dimension('customer__customer_type') }} in ('new'), {{ Entity('order_id') }} = 10])
where=[{{ Dimension('metric_time').grain('month') }} >= '2017-03-09', {{ Dimension('customer__customer_type') }} in ('new'), {{ Entity('order_id') }} = 10])
}}
```
@@ -287,6 +293,7 @@ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'],
order_by=['order_gross_profit'])
}}
```
### Query with compile keyword
Use the following example to query using a `compile` keyword:
@@ -26,11 +26,11 @@ import AvailIntegrations from '/snippets/_sl-partner-links.md';

## Custom integration

- You can create custom integrations using different languages and tools. We support connecting with JDBC, ADBC, and a GraphQL APIs. For more info, check out [our examples on GitHub](https://github.com/dbt-labs/example-semantic-layer-clients/).
- You can create custom integrations using different languages and tools. We support connecting with JDBC, ADBC, and GraphQL APIs. For more info, check out [our examples on GitHub](https://github.com/dbt-labs/example-semantic-layer-clients/).
- You can also connect to tools that allow you to write SQL. These tools must meet one of the two criteria:

- Supports a generic JDBC driver option (such as DataGrip) or
- Supports Dremio and uses ArrowFlightSQL driver version 12.0.0 or higher.
- Uses Arrow Flight SQL JDBC driver version 12.0.0 or higher.

## Related docs

4 changes: 2 additions & 2 deletions website/docs/guides/migration/sl-migration.md
@@ -65,8 +65,8 @@ This step is only relevant to users who want the legacy and new semantic layer t
1. Create a new deployment environment in dbt Cloud and set the dbt version to 1.6 or higher.
2. Choose `Only run on a custom branch` and point to the branch that has the updated metric definition.
3. Set the deployment schema to a temporary migration schema, such as `tmp_sl_migration`. Optionally, you can create a new database for the migration.
4. Create a job to parse your project, such as `dbt parse`, and run it. Make sure this job succeeds, There needs to be a successful job in your environment in order to set up the semantic layer
5. In Account Settings > Projects > Project details click `Configure the Semantic Layer`. Under **Environment**select the deployment environment you created in the previous step. Save your configuration.
4. Create a job to parse your project, such as `dbt parse`, and run it. Make sure this job succeeds; there needs to be a successful job in your environment in order to set up the semantic layer.
5. In Account Settings > Projects > Project details click `Configure the Semantic Layer`. Under **Environment**, select the deployment environment you created in the previous step. Save your configuration.
6. In the Project details page, click `Generate service token` and grant it `Semantic Layer Only` and `Metadata Only` permissions. Save this token securely - you will need it to connect to the semantic layer.
At this point, both the new semantic layer and the old semantic layer will be running. The new semantic layer will be pointing at your migration branch with the updated metrics definitions.
22 changes: 17 additions & 5 deletions website/docs/guides/migration/versions/00-upgrading-to-v1.7.md
@@ -14,11 +14,22 @@ description: New features and changes in dbt Core v1.7

dbt Labs is committed to providing backward compatibility for all versions 1.x, with the exception of any changes explicitly mentioned below. If you encounter an error upon upgrading, please let us know by [opening an issue](https://github.com/dbt-labs/dbt-core/issues/new).

### Behavior changes

dbt Core v1.7 expands the number of sources you can configure freshness for. Previously, freshness was limited to sources with a `loaded_at_field`; now, freshness can be generated from warehouse metadata tables when available.

As part of this change, the `loaded_at_field` is no longer required to generate source freshness. If a source has a `freshness:` block, dbt will attempt to calculate freshness for that source:
- If a `loaded_at_field` is provided, dbt will calculate freshness via a select query (previous behavior).
- If a `loaded_at_field` is _not_ provided, dbt will calculate freshness via warehouse metadata tables when possible (new behavior).

This is a relatively small behavior change, but worth calling out in case you notice that dbt is calculating freshness for _more_ sources than before. To exclude a source from freshness calculations, you have two options (sketched below):
- Don't add a `freshness:` block.
- Explicitly set `freshness: null`.
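
A sketch of how these rules might look in a sources YAML file (source, table, and column names are hypothetical):

```yaml
sources:
  - name: raw_events
    freshness: # freshness block present, so dbt calculates freshness
      warn_after: {count: 12, period: hour}
    tables:
      - name: clicks
        loaded_at_field: _etl_loaded_at # freshness via a select query (previous behavior)
      - name: impressions # no loaded_at_field: freshness from metadata tables where supported
  - name: static_lookups
    freshness: null # explicitly excluded from freshness calculations
    tables:
      - name: country_codes
```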

## New and changed features and functionality

- [`dbt docs generate`](/reference/commands/cmd-docs) now supports `--select` to generate documentation for a subset of your project. Currently available for Snowflake and Postgres only, but other adapters are coming soon.
- [Source freshness](/docs/deploy/source-freshness) can now be generated from warehouse metadata tables, currently snowflake only, but other adapters that have metadata tables are coming soon. If you configure source freshness without a `loaded_at_field`, dbt will try to determine freshness from warehouse metadata tables.
- The nodes dictionary in the `catalog.json` can now be "partial" if `dbt docs generate` is run with a selector.
- [Source freshness](/docs/deploy/source-freshness) can now be generated from warehouse metadata tables, currently Snowflake only, but other adapters that have metadata tables are coming soon.

### MetricFlow enhancements

@@ -32,8 +43,9 @@ dbt Labs is committed to providing backward compatibility for all versions 1.x,

- The [manifest](/reference/artifacts/manifest-json) schema version has been updated to v11.
- The [run_results](/reference/artifacts/run-results-json) schema version has been updated to v5.
- Added [node attributes](/reference/artifacts/run-results-json) related to compilation (`compiled`, `compiled_code`, `relation_name`).

- There are a few specific changes to the [catalog.json](/reference/artifacts/catalog-json):
- Added [node attributes](/reference/artifacts/run-results-json) related to compilation (`compiled`, `compiled_code`, `relation_name`) to the `catalog.json`.
- The nodes dictionary in the `catalog.json` can now be "partial" if `dbt docs generate` is run with a selector.

### Model governance

@@ -49,4 +61,4 @@ dbt Core v1.5 introduced model governance which we're continuing to refine. v1.
With these quick hits, you can now:
- Configure a `delimiter` for a seed file (see the sketch after this list).
- Use packages with the same git repo and unique subdirectory.
- Moved the `date_spine` macro from dbt-utils to dbt-core.
- Access the `date_spine` macro directly from dbt-core (moved over from dbt-utils).
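
A sketch of the new seed `delimiter` config (the project and seed names are hypothetical):

```yaml
# dbt_project.yml
seeds:
  my_project:
    pipe_delimited_seed:
      +delimiter: "|"
```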
@@ -90,4 +90,5 @@ More consistency and flexibility around packages. Resources defined in a package
- [`dbt debug --connection`](/reference/commands/debug) to test just the data platform connection specified in a profile
- [`dbt docs generate --empty-catalog`](/reference/commands/cmd-docs) to skip catalog population while generating docs
- [`--defer-state`](/reference/node-selection/defer) enables more-granular control
- [`dbt ls`](/reference/commands/list) adds the Semantic model selection method to allow for `dbt ls -s "semantic_model:*"` and the ability to execute `dbt ls --resource-type semantic_model`.

