diff --git a/website/docs/docs/build/dimensions.md b/website/docs/docs/build/dimensions.md index 975ae4d3160..25a1c729a7a 100644 --- a/website/docs/docs/build/dimensions.md +++ b/website/docs/docs/build/dimensions.md @@ -22,6 +22,7 @@ All dimensions require a `name`, `type`, and can optionally include an `expr` pa | `description` | A clear description of the dimension. | Optional | String | | `expr` | Defines the underlying column or SQL query for a dimension. If no `expr` is specified, MetricFlow will use the column with the same name as the group. You can use the column name itself to input a SQL expression. | Optional | String | | `label` | Defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Optional | String | +| [`meta`](/reference/resource-configs/meta) | Set metadata for a resource and organize resources. Accepts plain text, spaces, and quotes. | Optional | Dictionary | Refer to the following for the complete specification for dimensions: @@ -37,6 +38,8 @@ dimensions: Refer to the following example to see how dimensions are used in a semantic model: + + ```yaml semantic_models: - name: transactions @@ -59,6 +62,9 @@ semantic_models: type_params: time_granularity: day label: "Date of transaction" # Recommend adding a label to provide more context to users consuming the data + config: + meta: + data_owner: "Finance team" expr: ts - name: is_bulk type: categorical @@ -66,6 +72,40 @@ semantic_models: - name: type type: categorical ``` + + + + +```yaml +semantic_models: + - name: transactions + description: A record for every transaction that takes place. Carts are considered multiple transactions for each SKU. + model: {{ ref('fact_transactions') }} + defaults: + agg_time_dimension: order_date +# --- entities --- + entities: + - name: transaction + type: primary + ... +# --- measures --- + measures: + ... 
+# --- dimensions --- + dimensions: + - name: order_date + type: time + type_params: + time_granularity: day + label: "Date of transaction" # Recommend adding a label to provide more context to users consuming the data + expr: ts + - name: is_bulk + type: categorical + expr: case when quantity > 10 then true else false end + - name: type + type: categorical +``` + Dimensions are bound to the primary entity of the semantic model they are defined in. For example the dimension `type` is defined in a model that has `transaction` as a primary entity. `type` is scoped to the `transaction` entity, and to reference this dimension you would use the fully qualified dimension name i.e `transaction__type`. @@ -101,12 +141,28 @@ This section further explains the dimension definitions, along with examples. Di Categorical dimensions are used to group metrics by different attributes, features, or characteristics such as product type. They can refer to existing columns in your dbt model or be calculated using a SQL expression with the `expr` parameter. An example of a categorical dimension is `is_bulk_transaction`, which is a group created by applying a case statement to the underlying column `quantity`. This allows users to group or filter the data based on bulk transactions. + + +```yaml +dimensions: + - name: is_bulk_transaction + type: categorical + expr: case when quantity > 10 then true else false end + config: + meta: + usage: "Filter to identify bulk transactions, like where quantity > 10." +``` + + + + ```yaml dimensions: - name: is_bulk_transaction type: categorical expr: case when quantity > 10 then true else false end ``` + ## Time @@ -130,12 +186,17 @@ You can set `is_partition` for time to define specific time spans. Additionally, Use `is_partition: True` to show that a dimension exists over a specific time window. For example, a date-partitioned dimensional table. 
When you query metrics from different tables, the dbt Semantic Layer uses this parameter to ensure that the correct dimensional values are joined to measures. + + ```yaml dimensions: - name: created_at type: time label: "Date of creation" expr: ts_created # ts_created is the underlying column name from the table + config: + meta: + notes: "Only valid for orders from 2022 onward" is_partition: True type_params: time_granularity: day @@ -156,6 +217,37 @@ measures: expr: 1 agg: sum ``` + + + + +```yaml +dimensions: + - name: created_at + type: time + label: "Date of creation" + expr: ts_created # ts_created is the underlying column name from the table + is_partition: True + type_params: + time_granularity: day + - name: deleted_at + type: time + label: "Date of deletion" + expr: ts_deleted # ts_deleted is the underlying column name from the table + is_partition: True + type_params: + time_granularity: day + +measures: + - name: users_deleted + expr: 1 + agg: sum + agg_time_dimension: deleted_at + - name: users_created + expr: 1 + agg: sum +``` + diff --git a/website/docs/docs/build/entities.md b/website/docs/docs/build/entities.md index e4ed0773c3c..558dfd3aea4 100644 --- a/website/docs/docs/build/entities.md +++ b/website/docs/docs/build/entities.md @@ -95,17 +95,67 @@ Natural keys are columns or combinations of columns in a table that uniquely ide The following is the complete spec for entities: + + +```yaml +semantic_models: + - name: semantic_model_name + ..rest of the semantic model config + entities: + - name: entity_name ## Required + type: Primary, natural, foreign, or unique ## Required + description: A description of the field or role the entity takes in this table ## Optional + expr: The field that denotes that entity (transaction_id). ## Optional + Defaults to name if unspecified. + [config](/reference/resource-properties/config): Specify configurations for entity. 
## Optional + [meta](/reference/resource-configs/meta): {} Set metadata for a resource and organize resources. Accepts plain text, spaces, and quotes. ## Optional +``` + + + + +```yaml +semantic_models: + - name: semantic_model_name + ..rest of the semantic model config + entities: + - name: entity_name ## Required + type: Primary, or natural, or foreign, or unique ## Required + description: A description of the field or role the entity takes in this table ## Optional + expr: The field that denotes that entity (transaction_id). ## Optional + Defaults to name if unspecified. +``` + + +Here's an example of how to define entities in a semantic model: + + + ```yaml entities: - - name: transaction ## Required - type: Primary or natural or foreign or unique ## Required + - name: transaction + type: primary + expr: id_transaction + - name: order + type: foreign + expr: id_order + - name: user + type: foreign + expr: substring(id_order from 2) + entities: + - name: transaction + type: description: A description of the field or role the entity takes in this table ## Optional - expr: The field that denotes that entity (transaction_id). ## Optional + expr: The field that denotes that entity (transaction_id). Defaults to name if unspecified. + [config](/reference/resource-properties/config): + [meta](/reference/resource-configs/meta): + data_owner: "Finance team" ``` + + + -Here's an example of how to define entities in a semantic model: - ```yaml entities: - name: transaction @@ -117,11 +167,18 @@ entities: - name: user type: foreign expr: substring(id_order from 2) + entities: + - name: transaction + type: + description: A description of the field or role the entity takes in this table ## Optional + expr: The field that denotes that entity (transaction_id). + Defaults to name if unspecified. 
``` + ## Combine columns with a key -If a table doesn't have any key (like a primary key), use _surrogate combination_ to form a key that will help you identify a record by combining two columns. This applies to any [entity type](/docs/build/entities#entity-types). For example, you can combine `date_key` and `brand_code` from the `raw_brand_target_weekly` table to form a _surrogate key_. The following example creates a surrogate key by joining `date_key` and `brand_code` using a pipe (`|`) as a separator. +If a table doesn't have any key (like a primary key), use _surrogate combination_ to form a key that will help you identify a record by combining two columns. This applies to any [entity type](/docs/build/entities#entity-types). For example, you can combine `date_key` and `brand_code` from the `raw_brand_target_weekly` table to form a _surrogate key_. The following example creates a surrogate key by joining `date_key` and `brand_code` using a pipe (`|`) as a separator. ```yaml diff --git a/website/docs/docs/build/measures.md b/website/docs/docs/build/measures.md index d60aa3f7e21..aa66dc86731 100644 --- a/website/docs/docs/build/measures.md +++ b/website/docs/docs/build/measures.md @@ -18,16 +18,41 @@ import MeasuresParameters from '/snippets/_sl-measures-parameters.md'; An example of the complete YAML measures spec is below. The actual configuration of your measures will depend on the aggregation you're using. + + +```yaml +semantic_models: + - name: semantic_model_name + ..rest of the semantic model config + measures: + - name: The name of the measure + description: 'same as always' ## Optional + agg: the aggregation type. + expr: the field + agg_params: 'specific aggregation properties such as a percentile' ## Optional + agg_time_dimension: The time field. Defaults to the default agg time dimension for the semantic model. ## Optional + non_additive_dimension: 'Use these configs when you need non-additive dimensions.' 
## Optional + [config](/reference/resource-properties/config): Use the config property to specify configurations for your measure. ## Optional + [meta](/reference/resource-configs/meta): {} Set metadata for a resource and organize resources. Accepts plain text, spaces, and quotes. ## Optional +``` + + + + ```yaml -measures: - - name: The name of the measure - description: 'same as always' ## Optional - agg: the aggregation type. - expr: the field - agg_params: 'specific aggregation properties such as a percentile' ## Optional - agg_time_dimension: The time field. Defaults to the default agg time dimension for the semantic model. ## Optional - non_additive_dimension: 'Use these configs when you need non-additive dimensions.' ## Optional +semantic_models: + - name: semantic_model_name + ..rest of the semantic model config + measures: + - name: The name of the measure + description: 'same as always' ## Optional + agg: the aggregation type. + expr: the field + agg_params: 'specific aggregation properties such as a percentile' ## Optional + agg_time_dimension: The time field. Defaults to the default agg time dimension for the semantic model. ## Optional + non_additive_dimension: 'Use these configs when you need non-additive dimensions.' ## Optional ``` + ### Name @@ -96,6 +121,96 @@ If you use the `dayofweek` function in the `expr` parameter with the legacy Snow ### Model with different aggregations + + +```yaml +semantic_models: + - name: transactions + description: A record of every transaction that takes place. Carts are considered multiple transactions for each SKU. 
+ model: ref('schema.transactions') + defaults: + agg_time_dimension: transaction_date + +# --- entities --- + entities: + - name: transaction_id + type: primary + - name: customer_id + type: foreign + - name: store_id + type: foreign + - name: product_id + type: foreign + +# --- measures --- + measures: + - name: transaction_amount_usd + description: Total USD value of transactions + expr: transaction_amount_usd + agg: sum + config: + meta: + used_in_reporting: true + - name: transaction_amount_usd_avg + description: Average USD value of transactions + expr: transaction_amount_usd + agg: average + - name: transaction_amount_usd_max + description: Maximum USD value of transactions + expr: transaction_amount_usd + agg: max + - name: transaction_amount_usd_min + description: Minimum USD value of transactions + expr: transaction_amount_usd + agg: min + - name: quick_buy_transactions + description: The total transactions bought as quick buy + expr: quick_buy_flag + agg: sum_boolean + - name: distinct_transactions_count + description: Distinct count of transactions + expr: transaction_id + agg: count_distinct + - name: transaction_amount_avg + description: The average value of transactions + expr: transaction_amount_usd + agg: average + - name: transactions_amount_usd_valid # Notice here how we use expr to compute the aggregation based on a condition + description: The total USD value of valid transactions only + expr: CASE WHEN is_valid = True then transaction_amount_usd else 0 end + agg: sum + - name: transactions + description: The average value of transactions. + expr: transaction_amount_usd + agg: average + - name: p99_transaction_value + description: The 99th percentile transaction value + expr: transaction_amount_usd + agg: percentile + agg_params: + percentile: .99 + use_discrete_percentile: False # False calculates the continuous percentile, True calculates the discrete percentile. 
+ - name: median_transaction_value + description: The median transaction value + expr: transaction_amount_usd + agg: median + +# --- dimensions --- + dimensions: + - name: transaction_date + type: time + expr: date_trunc('day', ts) # expr refers to underlying column ts + type_params: + time_granularity: day + - name: is_bulk_transaction + type: categorical + expr: case when quantity > 10 then true else false end + +``` + + + + ```yaml semantic_models: - name: transactions @@ -177,6 +292,7 @@ semantic_models: expr: case when quantity > 10 then true else false end ``` + ### Non-additive dimensions diff --git a/website/docs/docs/build/simple.md b/website/docs/docs/build/simple.md index 2deb718d780..19dd4bb0086 100644 --- a/website/docs/docs/build/simple.md +++ b/website/docs/docs/build/simple.md @@ -15,6 +15,7 @@ Simple metrics are metrics that directly reference a single measure, without any Note that we use the double colon (::) to indicate whether a parameter is nested within another parameter. So for example, `query_params::metrics` means the `metrics` parameter is nested under `query_params`. ::: + | Parameter | Description | Required | Type | | --------- | ----------- | ---- | ---- | | `name` | The name of the metric. | Required | String | diff --git a/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.7.md b/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.7.md index df24b63a2f0..b98a76295cf 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.7.md +++ b/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.7.md @@ -66,7 +66,7 @@ dbt Core v1.5 introduced model governance which we're continuing to refine. v1. ### dbt clean -Starting in v1.7, `dbt clean` will only clean paths within the current working directory. The `--no-clean-project-files-only` flag will delete all paths specified in `clean-paths`, even if they're outside the dbt project. 
+Starting in v1.7, `dbt clean` will only clean paths within the current working directory. The `--no-clean-project-files-only` flag will delete all paths specified in the `clean-targets` section of `dbt_project.yml`, even if they're outside the dbt project. Supported flags: - `--clean-project-files-only` (default) diff --git a/website/docs/docs/dbt-versions/release-notes.md b/website/docs/docs/dbt-versions/release-notes.md index 9b2205e46d8..7af1db884f6 100644 --- a/website/docs/docs/dbt-versions/release-notes.md +++ b/website/docs/docs/dbt-versions/release-notes.md @@ -20,6 +20,7 @@ Release notes are grouped by month for both multi-tenant and virtual private clo ## December 2024 +- **New**: [Dimensions](/docs/build/dimensions) now support the `meta` config property in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and in dbt Core 1.9 and later. You can add metadata to your dimensions to provide additional context and information about the dimension. Refer to [meta](/reference/resource-configs/meta) for more information. - **New**: [Auto exposures](/docs/collaborate/auto-exposures) are now generally available to dbt Cloud Enterprise plans. Auto-exposures integrate natively with Tableau (Power BI coming soon) and auto-generate downstream lineage in dbt Explorer for a richer experience. - **New**: The dbt Semantic Layer supports Sigma as a [partner integration](/docs/cloud-integrations/avail-sl-integrations), available in Preview. Refer to [Sigma](https://help.sigmacomputing.com/docs/configure-a-dbt-semantic-layer-integration) for more information. - **New**: The dbt Semantic Layer now supports Azure Single-tenant deployments. Refer to [Set up the dbt Semantic Layer](/docs/use-dbt-semantic-layer/setup-sl) for more information on how to get started. 
@@ -29,7 +30,6 @@ Release notes are grouped by month for both multi-tenant and virtual private clo - **New**: You can now use your [Azure OpenAI key](/docs/cloud/account-integrations?ai-integration=azure#ai-integrations) (available in beta) to use dbt Cloud features like [dbt Copilot](/docs/cloud/dbt-copilot) and [Ask dbt](/docs/cloud-integrations/snowflake-native-app) . Additionally, you can use your own [OpenAI API key](/docs/cloud/account-integrations?ai-integration=openai#ai-integrations) or use [dbt Labs-managed OpenAI](/docs/cloud/account-integrations?ai-integration=dbtlabs#ai-integrations) key. Refer to [AI integrations](/docs/cloud/account-integrations#ai-integrations) for more information. - **New**: The [`hard_deletes`](/reference/resource-configs/hard-deletes) config gives you more control on how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Note that `new_record` will create a new metadata column in the snapshot table. - ## November 2024 - **Enhancement**: Data health signals in dbt Explorer are now available for Exposures, providing a quick view of data health while browsing resources. To view trust signal icons, go to dbt Explorer and click **Exposures** under the **Resource** tab. Refer to [Data health signals for resources](/docs/collaborate/data-health-signals) for more info. - **Bug**: Identified and fixed an error with Semantic Layer queries that take longer than 10 minutes to complete. 
diff --git a/website/docs/reference/resource-configs/meta.md b/website/docs/reference/resource-configs/meta.md index e1542bdbc82..a7f348d50ba 100644 --- a/website/docs/reference/resource-configs/meta.md +++ b/website/docs/reference/resource-configs/meta.md @@ -16,7 +16,7 @@ hide_table_of_contents: true { label: 'Analyses', value: 'analyses', }, { label: 'Macros', value: 'macros', }, { label: 'Exposures', value: 'exposures', }, - { label: 'Semantic Models', value: 'semantic models', }, + { label: 'Semantic models', value: 'semantic models', }, { label: 'Metrics', value: 'metrics', }, { label: 'Saved queries', value: 'saved queries', }, ] @@ -179,6 +179,27 @@ exposures: +Configure `meta` in your [semantic models](/docs/build/semantic-models) YAML file or under the `semantic-models` config block in the `dbt_project.yml` file. + + + + + +```yml +semantic_models: + - name: semantic_model_name + config: + meta: {} + +``` + + + + + + +[Dimensions](/docs/build/dimensions), [entities](/docs/build/entities), and [measures](/docs/build/measures) can also have their own `meta` configurations. + ```yml @@ -187,9 +208,25 @@ semantic_models: config: meta: {} + dimensions: + - name: dimension_name + config: + meta: {} + + entities: + - name: entity_name + config: + meta: {} + + measures: + - name: measure_name + config: + meta: {} + ``` + The `meta` config can also be defined under the `semantic-models` config block in `dbt_project.yml`. See [configs and properties](/reference/configs-and-properties) for details. @@ -249,13 +286,11 @@ saved_queries: ``` - - ## Definition -The `meta` field can be used to set metadata for a resource. This metadata is compiled into the `manifest.json` file generated by dbt, and is viewable in the auto-generated documentation. +The `meta` field can be used to set metadata for a resource and accepts any key-value pairs. This metadata is compiled into the `manifest.json` file generated by dbt, and is viewable in the auto-generated documentation. 
Depending on the resource you're configuring, `meta` may be available within the `config` property, and/or as a top-level key. (For backwards compatibility, `meta` is often (but not always) supported as a top-level key, though without the capabilities of config inheritance.) @@ -343,3 +378,107 @@ models: +### Assign meta to a semantic model + + +The following example shows how to assign a `meta` value to a [semantic model](/docs/build/semantic-models) in the `semantic_model.yml` file and `dbt_project.yml` file: + + + + +```yaml +semantic_models: + - name: transaction + model: ref('fact_transactions') + description: "Transaction fact table at the transaction level. This table contains one row per transaction and includes the transaction timestamp." + defaults: + agg_time_dimension: transaction_date + config: + meta: + data_owner: "Finance team" + used_in_reporting: true +``` + + + + + +```yaml +semantic-models: + jaffle_shop: + +meta: + used_in_reporting: true +``` + + + +### Assign meta to dimensions, measures, and entities + + + +Available in dbt version 1.9 and later. + + + + + + + + +The following example shows how to assign a `meta` value to a [dimension](/docs/build/dimensions), [entity](/docs/build/entities), and [measure](/docs/build/measures) in a semantic model: + + + +```yml +semantic_models: + - name: semantic_model + ... + dimensions: + - name: order_date + type: time + config: + meta: + data_owner: "Finance team" + used_in_reporting: true + entities: + - name: customer_id + type: primary + config: + meta: + description: "Unique identifier for customers" + data_owner: "Sales team" + used_in_reporting: false + measures: + - name: count_of_users + expr: user_id + config: + meta: + used_in_reporting: true +``` + + + + + + +This second example shows how to assign a `data_owner` and an additional metadata value to a dimension in the `dbt_project.yml` file using the `+meta` syntax. Similar syntax can be used for entities and measures. 
+ + + +```yml +semantic-models: + jaffle_shop: + ... + [dimensions](/docs/build/dimensions): + - name: order_date + config: + meta: + data_owner: "Finance team" + used_in_reporting: true +``` + + + + + + diff --git a/website/snippets/_sl-measures-parameters.md b/website/snippets/_sl-measures-parameters.md index 8d6b84a71dd..f80f90c3063 100644 --- a/website/snippets/_sl-measures-parameters.md +++ b/website/snippets/_sl-measures-parameters.md @@ -1,3 +1,4 @@ + | Parameter | Description | Required | Type | | --- | --- | --- | --- | | [`name`](/docs/build/measures#name) | Provide a name for the measure, which must be unique and can't be repeated across all semantic models in your dbt project. | Required | String | @@ -9,3 +10,4 @@ | `agg_time_dimension` | The time field. Defaults to the default agg time dimension for the semantic model. | Optional | String | | `label` | String that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). Available in dbt version 1.7 or higher. | Optional | String | | `create_metric` | Create a `simple` metric from a measure by setting `create_metric: True`. The `label` and `description` attributes will be automatically propagated to the created metric. Available in dbt version 1.7 or higher. | Optional | Boolean | +| `config` | Use the [`config`](/reference/resource-properties/config) property to specify configurations for your measure. Supports the [`meta`](/reference/resource-configs/meta) property, nested under `config`. | Optional |