diff --git a/website/blog/2022-04-14-add-ci-cd-to-bitbucket.md b/website/blog/2022-04-14-add-ci-cd-to-bitbucket.md index 44346e93741..e871687d8cd 100644 --- a/website/blog/2022-04-14-add-ci-cd-to-bitbucket.md +++ b/website/blog/2022-04-14-add-ci-cd-to-bitbucket.md @@ -1,5 +1,5 @@ --- -title: "Slim CI/CD with Bitbucket Pipelines" +title: "Slim CI/CD with Bitbucket Pipelines for dbt Core" description: "How to set up slim CI/CD outside of dbt Cloud" slug: slim-ci-cd-with-bitbucket-pipelines @@ -10,8 +10,15 @@ hide_table_of_contents: false date: 2022-05-06 is_featured: true +keywords: + - dbt core pipeline, slim ci pipeline, slim cd pipeline, bitbucket --- + +:::info Set up CI/CD with dbt Cloud +This blog is specifically tailored for dbt Core users. If you're using dbt Cloud and your Git provider doesn't have a native dbt Cloud integration (like BitBucket), follow the [Customizing CI/CD with custom pipelines guide](/guides/custom-cicd-pipelines?step=3) to set up CI/CD. +::: + Continuous Integration (CI) sets the system up to test everyone’s pull request before merging. Continuous Deployment (CD) deploys each approved change to production. “Slim CI” refers to running/testing only the changed code, [thereby saving compute](https://discourse.getdbt.com/t/how-we-sped-up-our-ci-runs-by-10x-using-slim-ci/2603). In summary, CI/CD automates dbt pipeline testing and deployment. [dbt Cloud](https://www.getdbt.com/), a much beloved method of dbt deployment, [supports GitHub- and Gitlab-based CI/CD](https://blog.getdbt.com/adopting-ci-cd-with-dbt-cloud/) out of the box. It doesn’t support Bitbucket, AWS CodeCommit/CodeDeploy, or any number of other services, but you need not give up hope even if you are tethered to an unsupported platform. diff --git a/website/dbt-versions.js b/website/dbt-versions.js index e5a2b9f4290..871c3ce601e 100644 --- a/website/dbt-versions.js +++ b/website/dbt-versions.js @@ -10,16 +10,14 @@ * @property {string} EOLDate "End of Life" date which is used to show the EOL banner * @property {boolean} isPrerelease Boolean used for showing the prerelease banner * @property {string} customDisplay Allows setting a custom display name for the current version + * + * customDisplay for dbt Cloud should be a version ahead of latest dbt Core release (GA or beta). */ exports.versions = [ { version: "1.9.1", customDisplay: "Cloud (Versionless)", }, - { - version: "1.9", - isPrerelease: true, - }, { version: "1.8", EOLDate: "2025-04-15", diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-semantic-structure.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-semantic-structure.md index 295d86e9c20..5bfbea82dda 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-semantic-structure.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-semantic-structure.md @@ -20,6 +20,10 @@ The first thing you need to establish is how you’re going to consistently stru It’s not terribly difficult to shift between these (it can be done with some relatively straightforward shell scripting), and this is purely a decision based on your developers’ preference (i.e. it has no impact on execution or performance), so don’t feel locked in to either path. Just pick the one that feels right and you can always shift down the road if you change your mind. 
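+For illustration, if you go the route of keeping semantic models and metrics in their own parallel sub-folders, a hypothetical layout might look like the following (folder and file names here are examples only, not requirements):
+
+```
+models
+├── marts
+│   ├── orders.sql
+│   └── orders.yml             # model configs and docs
+└── semantic_models
+    ├── sem_orders.yml         # semantic model for the orders mart
+    └── metrics
+        └── orders_metrics.yml # metrics built on that semantic model
+```
+
+The alternative path is to define the semantic model and metrics alongside the rest of a mart's configuration in the same YAML file (for example, everything for orders living in `models/marts/orders.yml`).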
+:::tip +Make sure to save all semantic models and metrics under the directory defined in the [`model-paths`](/reference/project-configs/model-paths) (or a subdirectory of it, like `models/semantic_models/`). If you save them outside of this path, it will result in an empty `semantic_manifest.json` file, and your semantic models or metrics won't be recognized. +::: + ## Naming Next, establish your system for consistent file naming: diff --git a/website/docs/best-practices/how-we-structure/5-the-rest-of-the-project.md b/website/docs/best-practices/how-we-structure/5-the-rest-of-the-project.md index 2dca148a226..c7522bf12eb 100644 --- a/website/docs/best-practices/how-we-structure/5-the-rest-of-the-project.md +++ b/website/docs/best-practices/how-we-structure/5-the-rest-of-the-project.md @@ -102,11 +102,11 @@ We’ve focused heavily thus far on the primary area of action in our dbt projec ### Project splitting -One important, growing consideration in the analytics engineering ecosystem is how and when to split a codebase into multiple dbt projects. Our present stance on this for most projects, particularly for teams starting out, is straightforward: you should avoid it unless you have no other option or it saves you from an even more complex workaround. If you do have the need to split up your project, it’s completely possible through the use of private packages, but the added complexity and separation is, for most organizations, a hindrance not a help, at present. That said, this is very likely subject to change! [We want to create a world where it’s easy to bring lots of dbt projects together into a cohesive lineage](https://github.com/dbt-labs/dbt-core/discussions/5244). In a world where it’s simple to break up monolithic dbt projects into multiple connected projects, perhaps inside of a modern monorepo, the calculus will be different, and the below situations we recommend against may become totally viable. So watch this space! +One important, growing consideration in the analytics engineering ecosystem is how and when to split a codebase into multiple dbt projects. Our present stance on this for most projects, particularly for teams starting out, is straightforward: you should avoid it unless you have no other option or it saves you from an even more complex workaround. If you do have the need to split up your project, it’s completely possible through the use of private packages, but the added complexity and separation is, for most organizations, a hindrance, not a help, at present. That said, this is very likely subject to change! [We want to create a world where it’s easy to bring lots of dbt projects together into a cohesive lineage](https://github.com/dbt-labs/dbt-core/discussions/5244). In a world where it’s simple to break up monolithic dbt projects into multiple connected projects, perhaps inside of a modern mono repo, the calculus will be different, and the below situations we recommend against may become totally viable. So watch this space! -- ❌ **Business groups or departments.** Conceptual separations within the project are not a good reason to split up your project. Splitting up, for instance, marketing and finance modeling into separate projects will not only add unnecessary complexity, but destroy the unifying effect of collaborating across your organization on cohesive definitions and business logic. 
-- ❌ **ML vs Reporting use cases.** Similarly to the point above, splitting a project up based on different use cases, particularly more standard BI versus ML features, is a common idea. We tend to discourage it for the time being. As with the previous point, a foundational goal of implementing dbt is to create a single source of truth in your organization. The features you’re providing to your data science teams should be coming from the same marts and metrics that serve reports on executive dashboards. There are a growing number of tools like [fal](https://blog.fal.ai/introducing-fal-dbt/) and [Continual.ai](http://Continual.ai) that make excellent use of this unified viewpoint. -- ✅ **Data governance.** Structural, organizational needs — such as data governance and security — are one of the few worthwhile reasons to split up a project. If, for instance, you work at a healthcare company with only a small team cleared to access raw data with PII in it, you may need to split out your staging models into their own project to preserve those policies. In that case, you would import your staging project into the project that builds on those staging models as a [private package](https://docs.getdbt.com/docs/build/packages/#private-packages). +- ❌ **Business groups or departments.** Conceptual separations within the project are not a good reason to split up your project. Splitting up, for instance, marketing and finance modeling into separate projects will not only add unnecessary complexity but destroy the unifying effect of collaborating across your organization on cohesive definitions and business logic. +- ❌ **ML vs Reporting use cases.** Similarly to the point above, splitting a project up based on different use cases, particularly more standard BI versus ML features, is a common idea. We tend to discourage it for the time being. As with the previous point, a foundational goal of implementing dbt is to create a single source of truth in your organization. The features you’re providing to your data science teams should be coming from the same marts and metrics that serve reports on executive dashboards. +- ✅ **Data governance.** Structural, organizational needs — such as data governance and security — are one of the few worthwhile reasons to split up a project. If, for instance, you work at a healthcare company with only a small team cleared to access raw data with PII in it, you may need to split out your staging models into their own projects to preserve those policies. In that case, you would import your staging project into the project that builds on those staging models as a [private package](https://docs.getdbt.com/docs/build/packages/#private-packages). - ✅ **Project size.** At a certain point, your project may grow to have simply too many models to present a viable development experience. If you have 1000s of models, it absolutely makes sense to find a way to split up your project. ## Final considerations diff --git a/website/docs/community/resources/getting-help.md b/website/docs/community/resources/getting-help.md index 19b7c22fbdf..e8dba3ef918 100644 --- a/website/docs/community/resources/getting-help.md +++ b/website/docs/community/resources/getting-help.md @@ -55,9 +55,9 @@ If you need dedicated support to build your dbt project, consider reaching out r If you want to receive dbt training, check out our [dbt Learn](https://learn.getdbt.com/) program. 
## dbt Cloud support

-**Note:** If you are a **dbt Cloud user** and need help with one of the following issues, please reach out to us by using the speech bubble (💬) in the dbt Cloud interface or at support@getdbt.com
+**Note:** If you are a **dbt Cloud user** and need help with one of the following issues, please reach out to us by clicking **Create a support ticket** through the dbt Cloud navigation or emailing support@getdbt.com:

- Account setup (e.g. connection issues, repo connections)
- Billing
- Bug reports related to the web interface

-As a rule of thumb, if you are using dbt Cloud, but your problem is related to code within your dbt project, then please follow the above process rather than reaching out to support. Refer to [dbt Cloud support](/docs/dbt-support) for more information.
+As a rule of thumb, if you are using dbt Cloud, but your problem is related to code within your dbt project, then please follow the above process or check out the [FAQs](/docs/faqs) rather than reaching out to support. Refer to [dbt Cloud support](/docs/dbt-support) for more information.
diff --git a/website/docs/docs/build/cumulative-metrics.md b/website/docs/docs/build/cumulative-metrics.md
index aa2b85aa9c8..056ff79c6eb 100644
--- a/website/docs/docs/build/cumulative-metrics.md
+++ b/website/docs/docs/build/cumulative-metrics.md
@@ -16,6 +16,8 @@ Note that we use the double colon (::) to indicate whether a parameter is nested

## Parameters

+
+
| Parameter | Description | Type |
| --------- | ----------- | ---- |
| `name` | The name of the metric. | Required |
@@ -32,11 +34,33 @@ Note that we use the double colon (::) to indicate whether a parameter is nested
| `measure::fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `measure::join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional |
+
+
+
+| Parameter | Description | Type |
+| --------- | ----------- | ---- |
+| `name` | The name of the metric. | Required |
+| `description` | The description of the metric. | Optional |
+| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
+| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
+| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
+| `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
+| `grain_to_date` | Sets the accumulation grain, such as `month`, which will accumulate data for one month and then restart at the beginning of the next. This can't be used with `window`. | Optional |
+| `type_params::measure` | A list of measure inputs. | Required |
+| `measure:name` | The measure you are referencing. | Optional |
+| `measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
+| `measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional |
+
+ ### Complete specification The following displays the complete specification for cumulative metrics, along with an example: + + ```yaml metrics: - name: The metric name # Required @@ -54,13 +78,35 @@ metrics: join_to_timespine: true/false # Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. # Optional ``` + + + + +```yaml +metrics: + - name: The metric name # Required + description: The metric description # Optional + type: cumulative # Required + label: The value that will be displayed in downstream tools # Required + type_params: # Required + measure: + name: The measure you are referencing # Required + fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional + join_to_timespine: false # Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. # Optional + window: 1 month # The accumulation window, such as 1 month, 7 days, 1 year. Optional. Cannot be used with grain_to_date. + grain_to_date: month # Sets the accumulation grain, such as month will accumulate data for one month, then restart at the beginning of the next. Optional. Cannot be used with window. +``` + + ## Cumulative metrics example Cumulative metrics measure data over a given window and consider the window infinite when no window parameter is passed, accumulating the data over all time. -The following example shows how to define cumulative metrics in a YAML file. In this example, we define three cumulative metrics: +The following example shows how to define cumulative metrics in a YAML file: + + - `cumulative_order_total`: Calculates the cumulative order total over all time. Uses `type params` to specify the measure `order_total` to be aggregated. @@ -68,10 +114,23 @@ The following example shows how to define cumulative metrics in a YAML file. In - `cumulative_order_total_mtd`: Calculates the month-to-date cumulative order total, respectively. Uses `cumulative_type_params` to specify a `grain_to_date` of `month`. + + + + +- `cumulative_order_total`: Calculates the cumulative order total over all time. Uses `type params` to specify the measure `order_total` to be aggregated. + +- `cumulative_order_total_l1m`: Calculates the trailing 1-month cumulative order total. Uses `type params` to specify a `window` of 1 month. + +- `cumulative_order_total_mtd`: Calculates the month-to-date cumulative order total, respectively. Uses `type params` to specify a `grain_to_date` of `month`. 
+ + + -```yaml + +```yaml metrics: - name: cumulative_order_total label: Cumulative order total (All-Time) @@ -101,8 +160,44 @@ metrics: cumulative_type_params: grain_to_date: month ``` + + + + +```yaml +metrics: + - name: cumulative_order_total + label: Cumulative order total (All-Time) + description: The cumulative value of all orders + type: cumulative + type_params: + measure: + name: order_total + + - name: cumulative_order_total_l1m + label: Cumulative order total (L1M) + description: Trailing 1-month cumulative order total + type: cumulative + type_params: + measure: + name: order_total + window: 1 month + + - name: cumulative_order_total_mtd + label: Cumulative order total (MTD) + description: The month-to-date value of all orders + type: cumulative + type_params: + measure: + name: order_total + grain_to_date: month +``` + + + + ### Granularity options Use the `period_agg` parameter with `first()`, `last()`, and `average()` functions to aggregate cumulative metrics over the requested period. This is because granularity options for cumulative metrics are different than the options for other metric types. @@ -192,6 +287,8 @@ group by + + ### Window options This section details examples of when to specify and not to specify window options. @@ -218,6 +315,8 @@ measures: We can write a cumulative metric `weekly_customers` as such: + + ``` yaml @@ -240,6 +339,31 @@ From the sample YAML example, note the following: For example, in the `weekly_customers` cumulative metric, MetricFlow takes a sliding 7-day window of relevant customers and applies a count distinct function. +If you remove `window`, the measure will accumulate over all time. + + + + + + +``` yaml +metrics: + - name: weekly_customers # Define the measure and the window. + type: cumulative + type_params: + measure: customers + window: 7 days # Setting the window to 7 days since we want to track weekly active +``` + + + +From the sample YAML example, note the following: + +* `type`: Specify cumulative to indicate the type of metric. +* `type_params`: Configure the cumulative metric by providing a `measure` and optionally add a `window` or `grain_to_date` configuration. + +For example, in the `weekly_customers` cumulative metric, MetricFlow takes a sliding 7-day window of relevant customers and applies a count distinct function. + If you remove `window`, the measure will accumulate over all time. @@ -286,7 +410,6 @@ metrics: ``` - ### Grain to date @@ -310,6 +433,8 @@ We can compare the difference between a 1-month window and a monthly grain to da + + ```yaml metrics: - name: cumulative_order_total_l1m # For this metric, we use a window of 1 month @@ -330,10 +455,33 @@ metrics: grain_to_date: month # Resets at the beginning of each month period_agg: first # Optional. Defaults to first. 
Accepted values: first|last|average ``` + + + + +```yaml +metrics: + - name: cumulative_order_total_l1m # For this metric, we use a window of 1 month + label: Cumulative order total (L1M) + description: Trailing 1-month cumulative order amount + type: cumulative + type_params: + measure: order_total + window: 1 month # Applies a sliding window of 1 month + - name: cumulative_order_total_mtd # For this metric, we use a monthly grain-to-date + label: Cumulative order total (MTD) + description: The month-to-date value of all orders + type: cumulative + type_params: + measure: order_total + grain_to_date: month # Resets at the beginning of each month +``` + Cumulative metric with grain to date: + ```yaml @@ -390,10 +538,25 @@ order by ``` + + + + + +```yaml +- name: orders_last_month_to_date + label: Orders month to date + type: cumulative + type_params: + measure: order_count + grain_to_date: month +``` + + ## SQL implementation example -To calculate the cumulative value of the metric over a given window, join the timespine table using the primary time dimension. Use the accumulation window in the join to decide which days to include in the calculation. +To calculate the cumulative value of the metric over a given window we do a time range join to a timespine table using the primary time dimension as the join key. We use the accumulation window in the join to decide whether a record should be included on a particular day. The following SQL code produced from an example cumulative metric is provided for reference: To implement cumulative metrics, refer to the SQL code example: diff --git a/website/docs/docs/build/hooks-operations.md b/website/docs/docs/build/hooks-operations.md index 9ed20291c34..6cec2a673c0 100644 --- a/website/docs/docs/build/hooks-operations.md +++ b/website/docs/docs/build/hooks-operations.md @@ -72,6 +72,41 @@ You can use hooks to provide database-specific functionality not available out-o You can also use a [macro](/docs/build/jinja-macros#macros) to bundle up hook logic. Check out some of the examples in the reference sections for [on-run-start and on-run-end hooks](/reference/project-configs/on-run-start-on-run-end) and [pre- and post-hooks](/reference/resource-configs/pre-hook-post-hook). + + +```sql +{{ config( + pre_hook=[ + "{{ some_macro() }}" + ] +) }} +``` + + + + + +```yaml +models: + - name: + config: + pre_hook: + - "{{ some_macro() }}" +``` + + + + + +```yaml +models: + : + +pre-hook: + - "{{ some_macro() }}" +``` + + + ## About operations Operations are [macros](/docs/build/jinja-macros#macros) that you can run using the [`run-operation`](/reference/commands/run-operation) command. As such, operations aren't actually a separate resource in your dbt project — they are just a convenient way to invoke a macro without needing to run a model. diff --git a/website/docs/docs/build/jinja-macros.md b/website/docs/docs/build/jinja-macros.md index fc4a0cad3e8..bc91e3674c9 100644 --- a/website/docs/docs/build/jinja-macros.md +++ b/website/docs/docs/build/jinja-macros.md @@ -74,7 +74,7 @@ group by 1 You can recognize Jinja based on the delimiters the language uses, which we refer to as "curlies": - **Expressions `{{ ... }}`**: Expressions are used when you want to output a string. You can use expressions to reference [variables](/reference/dbt-jinja-functions/var) and call [macros](/docs/build/jinja-macros#macros). - **Statements `{% ... %}`**: Statements don't output a string. 
They are used for control flow, for example, to set up `for` loops and `if` statements, to [set](https://jinja.palletsprojects.com/en/3.1.x/templates/#assignments) or [modify](https://jinja.palletsprojects.com/en/3.1.x/templates/#expression-statement) variables, or to define macros. -- **Comments `{# ... #}`**: Jinja comments are used to prevent the text within the comment from executing or outputing a string. +- **Comments `{# ... #}`**: Jinja comments are used to prevent the text within the comment from executing or outputing a string. Don't use `--` for comment. When used in a dbt model, your Jinja needs to compile to a valid query. To check what SQL your Jinja compiles to: * **Using dbt Cloud:** Click the compile button to see the compiled SQL in the Compiled SQL pane diff --git a/website/docs/docs/build/metricflow-time-spine.md b/website/docs/docs/build/metricflow-time-spine.md index 23040459ea4..18acf451a12 100644 --- a/website/docs/docs/build/metricflow-time-spine.md +++ b/website/docs/docs/build/metricflow-time-spine.md @@ -19,7 +19,8 @@ Previously, you were required to create a model called `metricflow_time_spine` i - + + ```yaml models: - name: time_spine_hourly @@ -35,6 +36,7 @@ models: - name: date_day granularity: day # set granularity at column-level for standard_granularity_column ``` + Now, break down the configuration above. It's pointing to a model called `time_spine_daily`. It sets the time spine configurations under the `time_spine` key. The `standard_granularity_column` is the lowest grain of the table, in this case, it's hourly. It needs to reference a column defined under the columns key, in this case, `date_hour`. Use the `standard_granularity_column` as the join key for the time spine table when joining tables in MetricFlow. Here, the granularity of the `standard_granularity_column` is set at the column level, in this case, `hour`. @@ -45,7 +47,7 @@ The example creates a time spine at a daily grain and an hourly grain. A few thi * You can add a time spine for each granularity you intend to use if query efficiency is more important to you than configuration time, or storage constraints. For most engines, the query performance difference should be minimal and transforming your time spine to a coarser grain at query time shouldn't add significant overhead to your queries. * We recommend having a time spine at the finest grain used in any of your dimensions to avoid unexpected errors. i.e., if you have dimensions at an hourly grain, you should have a time spine at an hourly grain. - + @@ -111,16 +113,15 @@ select * from final where date_day > dateadd(year, -4, current_timestamp()) and date_hour < dateadd(day, 30, current_timestamp()) ``` - - +Use this model if you're using BigQuery. BigQuery supports `DATE()` instead of `TO_DATE()`: + + ```sql --- filename: metricflow_time_spine.sql --- BigQuery supports DATE() instead of TO_DATE(). Use this model if you're using BigQuery {{config(materialized='table')}} with days as ( {{dbt_utils.date_spine( @@ -142,14 +143,15 @@ from final where date_day > dateadd(year, -4, current_timestamp()) and date_hour < dateadd(day, 30, current_timestamp()) ``` - + + + ```sql --- filename: metricflow_time_spine.sql --- BigQuery supports DATE() instead of TO_DATE(). 
Use this model if you're using BigQuery + {{config(materialized='table')}} with days as ( {{dbt.date_spine( @@ -171,14 +173,15 @@ from final where date_day > dateadd(year, -4, current_timestamp()) and date_hour < dateadd(day, 30, current_timestamp()) ``` - + + + ## Hourly time spine ```sql --- filename: metricflow_time_spine_hour.sql {{ config( materialized = 'table', diff --git a/website/docs/docs/build/unit-tests.md b/website/docs/docs/build/unit-tests.md index 55b35721298..1d7143d7476 100644 --- a/website/docs/docs/build/unit-tests.md +++ b/website/docs/docs/build/unit-tests.md @@ -22,7 +22,8 @@ With dbt Core v1.8 and dbt Cloud environments that have gone versionless by sele - We currently only support unit testing SQL models. - We currently only support adding unit tests to models in your _current_ project. -- We currently *don't* support unit testing models that use recursive SQL. +- We currently _don't_ support unit testing models that use the [`materialized view`](/docs/build/materializations#materialized-view) materialization. +- We currently _don't_ support unit testing models that use recursive SQL. - You must specify all fields in a BigQuery STRUCT in a unit test. You cannot use only a subset of fields in a STRUCT. - If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](/reference/resource-properties/unit-testing-versions) for more information. - Unit tests must be defined in a YML file in your `models/` directory. diff --git a/website/docs/docs/cloud-integrations/avail-sl-integrations.md b/website/docs/docs/cloud-integrations/avail-sl-integrations.md index eea93c92b93..04d9d55acb4 100644 --- a/website/docs/docs/cloud-integrations/avail-sl-integrations.md +++ b/website/docs/docs/cloud-integrations/avail-sl-integrations.md @@ -20,7 +20,7 @@ import AvailIntegrations from '/snippets/_sl-partner-links.md'; ### Custom integration - [Exports](/docs/use-dbt-semantic-layer/exports) enable custom integration with additional tools that don't natively connect with the dbt Semantic Layer, such as PowerBI. -- Develop custom integrations using different languages and tools, supported through JDBC, ADBC, and GraphQL APIs. For more info, check out [our examples on GitHub](https://github.com/dbt-labs/example-semantic-layer-clients/). +- [Consume metrics](/docs/use-dbt-semantic-layer/consume-metrics) and develop custom integrations using different languages and tools, supported through [JDBC](/docs/dbt-cloud-apis/sl-jdbc), ADBC, and [GraphQL](/docs/dbt-cloud-apis/sl-graphql) APIs, and [Python SDK library](/docs/dbt-cloud-apis/sl-python). For more info, check out [our examples on GitHub](https://github.com/dbt-labs/example-semantic-layer-clients/). - Connect to any tool that supports SQL queries. These tools must meet one of the two criteria: - Offers a generic JDBC driver option (such as DataGrip) or - Is compatible Arrow Flight SQL JDBC driver version 12.0.0 or higher. diff --git a/website/docs/docs/cloud-integrations/semantic-layer/excel.md b/website/docs/docs/cloud-integrations/semantic-layer/excel.md index 4f76bfc5c97..e666bda0e58 100644 --- a/website/docs/docs/cloud-integrations/semantic-layer/excel.md +++ b/website/docs/docs/cloud-integrations/semantic-layer/excel.md @@ -39,9 +39,9 @@ import Tools from '/snippets/_sl-excel-gsheets.md'; type="Microsoft Excel" bullet_1="There's a timeout of 1 minute for queries." 
bullet_2="If you're using this extension, make sure you're signed into Microsoft with the same Excel profile you used to set up the Add-In. Log in with one profile at a time as using multiple profiles at once might cause issues." +queryBuilder="/img/docs/dbt-cloud/semantic-layer/query-builder.png" /> - ## FAQs diff --git a/website/docs/docs/cloud-integrations/semantic-layer/gsheets.md b/website/docs/docs/cloud-integrations/semantic-layer/gsheets.md index b3931f0f528..f215bee9671 100644 --- a/website/docs/docs/cloud-integrations/semantic-layer/gsheets.md +++ b/website/docs/docs/cloud-integrations/semantic-layer/gsheets.md @@ -40,13 +40,15 @@ import Tools from '/snippets/_sl-excel-gsheets.md'; type="Google Sheets" bullet_1="The custom menu operation has a timeout limit of six (6) minutes." bullet_2="If you're using this extension, make sure you're signed into Chrome with the same Google profile you used to set up the Add-On. Log in with one Google profile at a time as using multiple Google profiles at once might cause issues." -queryBuilder="/img/docs/dbt-cloud/semantic-layer/gsheets-query-builder.jpg" +queryBuilder="/img/docs/dbt-cloud/semantic-layer/query-builder.png" +PrivateSelections="You can also make these selections private or public. Public selections mean your inputs are available in the menu to everyone on the sheet. +Private selections mean your inputs are only visible to you. Note that anyone added to the sheet can still see the data from these private selections, but they won't be able to interact with the selection in the menu or benefit from the automatic refresh." /> - + **Limited use policy disclosure** diff --git a/website/docs/docs/cloud/about-cloud/browsers.md b/website/docs/docs/cloud/about-cloud/browsers.md index 12665bc7b72..1e26d3a6d59 100644 --- a/website/docs/docs/cloud/about-cloud/browsers.md +++ b/website/docs/docs/cloud/about-cloud/browsers.md @@ -27,4 +27,4 @@ To improve your experience using dbt Cloud, we suggest that you turn off ad bloc A session is a period of time during which you’re signed in to a dbt Cloud account from a browser. If you close your browser, it will end your session and log you out. You'll need to log in again the next time you try to access dbt Cloud. -If you've logged in using [SSO](/docs/cloud/manage-access/sso-overview) or [OAuth](/docs/cloud/git/connect-github#personally-authenticate-with-github), you can customize your maximum session duration, which might vary depending on your identity provider (IdP). +If you've logged in using [SSO](/docs/cloud/manage-access/sso-overview), you can customize your maximum session duration, which might vary depending on your identity provider (IdP). diff --git a/website/docs/docs/cloud/git/connect-github.md b/website/docs/docs/cloud/git/connect-github.md index 4dc4aaf73e9..f230f70e1f6 100644 --- a/website/docs/docs/cloud/git/connect-github.md +++ b/website/docs/docs/cloud/git/connect-github.md @@ -7,7 +7,6 @@ sidebar_label: "Connect to GitHub" Connecting your GitHub account to dbt Cloud provides convenience and another layer of security to dbt Cloud: -- Log into dbt Cloud using OAuth through GitHub. - Import new GitHub repositories with a couple clicks during dbt Cloud project setup. - Clone repos using HTTPS rather than SSH. - Trigger [Continuous integration](/docs/deploy/continuous-integration)(CI) builds when pull requests are opened in GitHub. @@ -48,15 +47,15 @@ To connect your dbt Cloud account to your GitHub account: - Read and write access to Workflows 6. 
Once you grant access to the app, you will be redirected back to dbt Cloud and shown a linked account success state. You are now personally authenticated. -7. Ask your team members to [personally authenticate](/docs/cloud/git/connect-github#personally-authenticate-with-github) by connecting their GitHub profiles. +7. Ask your team members to individually authenticate by connecting their [personal GitHub profiles](#authenticate-your-personal-github-account). ## Limiting repository access in GitHub If you are your GitHub organization owner, you can also configure the dbt Cloud GitHub application to have access to only select repositories. This configuration must be done in GitHub, but we provide an easy link in dbt Cloud to start this process. -## Personally authenticate with GitHub +## Authenticate your personal GitHub account -Once the dbt Cloud admin has [set up a connection](/docs/cloud/git/connect-github#installing-dbt-cloud-in-your-github-account) to your organization GitHub account, you need to personally authenticate, which improves the security of dbt Cloud by enabling you to log in using OAuth through GitHub. +After the dbt Cloud administrator [sets up a connection](/docs/cloud/git/connect-github#installing-dbt-cloud-in-your-github-account) to your organization's GitHub account, you need to authenticate using your personal account. You must connect your personal GitHub profile to dbt Cloud to use the [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) and [CLI](/docs/cloud/cloud-cli-installation) and verify your read and write access to the repository. :::info GitHub profile connection @@ -77,7 +76,7 @@ To connect a personal GitHub account: 4. Once you approve authorization, you will be redirected to dbt Cloud, and you should now see your connected account. -The next time you log into dbt Cloud, you will be able to do so via OAuth through GitHub, and if you're on the Enterprise plan, you're ready to use the dbt Cloud IDE or dbt Cloud CLI. +You can now use the dbt Cloud IDE or dbt Cloud CLI. ## FAQs diff --git a/website/docs/docs/cloud/manage-access/set-up-snowflake-oauth.md b/website/docs/docs/cloud/manage-access/set-up-snowflake-oauth.md index 3b3b9c2d870..e9c4236438e 100644 --- a/website/docs/docs/cloud/manage-access/set-up-snowflake-oauth.md +++ b/website/docs/docs/cloud/manage-access/set-up-snowflake-oauth.md @@ -43,7 +43,7 @@ CREATE OR REPLACE SECURITY INTEGRATION DBT_CLOUD ENABLED = TRUE OAUTH_CLIENT = CUSTOM OAUTH_CLIENT_TYPE = 'CONFIDENTIAL' - OAUTH_REDIRECT_URI = LOCATED_REDIRECT_URI + OAUTH_REDIRECT_URI = 'LOCATED_REDIRECT_URI' OAUTH_ISSUE_REFRESH_TOKENS = TRUE OAUTH_REFRESH_TOKEN_VALIDITY = 7776000; ``` diff --git a/website/docs/docs/cloud/migration.md b/website/docs/docs/cloud/migration.md index 8bdf47eae5a..3aec1956297 100644 --- a/website/docs/docs/cloud/migration.md +++ b/website/docs/docs/cloud/migration.md @@ -7,34 +7,45 @@ pagination_next: null pagination_prev: null --- -dbt Labs is in the process of migrating dbt Cloud to a new _cell-based architecture_. This architecture will be the foundation of dbt Cloud for years to come, and will bring improved scalability, reliability, and security to all customers and users of dbt Cloud. +dbt Labs is in the process of rolling out a new cell-based architecture for dbt Cloud. This architecture provides the foundation of dbt Cloud for years to come, and brings improved reliability, performance, and consistency to users of dbt Cloud. -There is some preparation required to ensure a successful migration. 
+We're scheduling migrations by account. When we're ready to migrate your account, you will receive a banner or email communication with your migration date. If you have not received this communication, then you don't need to take action at this time. dbt Labs will share information about your migration with you, with appropriate advance notice, when applicable to your account.

-Migrations are being scheduled on a per-account basis. _If you haven't received any communication (either with a banner or by email) about a migration date, you don't need to take any action at this time._ dbt Labs will share migration date information with you, with appropriate advance notice, before we complete any migration steps in the dbt Cloud backend.
+Your account will be automatically migrated on its scheduled date. However, if you use certain features, you must take action before that date to avoid service disruptions.

-This document outlines the steps that you must take to prevent service disruptions before your environment is migrated over to the cell-based architecture. This will impact areas such as login, IP restrictions, and API access.
+## Recommended actions

-## Pre-migration checklist
+We highly recommend you take these actions:

-Prior to your migration date, your dbt Cloud account admin will need to make some changes to your account. Most of your configurations will be migrated automatically, but a few will require manual intervention.
+- Ensure pending user invitations are accepted or note outstanding invitations. Pending user invitations will be voided during the migration and must be resent after it is complete.
+- Commit unsaved changes in the [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud). Unsaved changes will be lost during migration.
+- Export and download [audit logs](/docs/cloud/manage-access/audit-log) older than 90 days, as they will be lost during migration. If you lose critical logs older than 90 days during the migration, you will have to work with the dbt Labs Customer Support team to recover.

-If your account is scheduled for migration, you will see a banner indicating your migration date when you log in. If you don't see a banner, you don't need to take any action.
+## Required actions

-1. **IP addresses** — dbt Cloud will be using new IPs to access your warehouse after the migration. Make sure to allow inbound traffic from these IPs in your firewall and include it in any database grants. All six of the IPs below should be added to allowlists.
-   * Old IPs: `52.45.144.63`, `54.81.134.249`, `52.22.161.231`
-   * New IPs: `52.3.77.232`, `3.214.191.130`, `34.233.79.135`
-2. **User invitations** — Any pending user invitations will be invalidated during the migration. You can resend the invitations after the migration is complete.
-3. **SSO integrations** — If you've completed the Auth0 migration, your account SSO configurations will be automatically transferred. If you haven't completed the Auth0 migration, dbt Labs recommends doing that before starting the mult-cell migration to avoid service disruptions.
-4. **IDE sessions** — Any unsaved changes in the IDE might be lost during migration. dbt Labs _strongly_ recommends committing all changes in the IDE before your scheduled migration time.
+These actions are required to prevent users from losing access to dbt Cloud:

-## Post-migration
+- If you still need to, complete [Auth0 migration for SSO](/docs/cloud/manage-access/auth0-migration) before your scheduled migration date to avoid service disruptions.
If you've completed the Auth0 migration, your account SSO configurations will be transferred automatically. +- Update your IP allow lists. dbt Cloud will be using new IPs to access your warehouse post-migration. Allow inbound traffic from all of the following new IPs in your firewall and include them in any database grants: -After migration, if you completed all the [Pre-migration checklist](#pre-migration-checklist) items, your dbt Cloud resources and jobs will continue to work as they did before. + - `52.3.77.232` + - `3.214.191.130` + - `34.233.79.135` -You have the option to log in to dbt Cloud at a different URL: - * If you were previously logging in at `cloud.getdbt.com`, you should instead plan to login at `us1.dbt.com`. The original URL will still work, but you’ll have to click through to be redirected upon login. - * You may also log in directly with your account’s unique [access URL](/docs/cloud/about-cloud/access-regions-ip-addresses#accessing-your-account). + Keep the old dbt Cloud IPs listed until the migration is complete. -:::info Login with GitHub -Users who previously used the "Login with GitHub" functionality will no longer be able to use this method to login to dbt Cloud after migration. To continue accessing your account, you can use your existing email and password. +## Post-migration​ + +Complete all of these items to ensure your dbt Cloud resources and jobs will continue working without interruption. + +Use one of these two URL login options: + +- `us1.dbt.com.` If you were previously logging in at `cloud.getdbt.com`, you should instead plan to log in at us1.dbt.com. The original URL will still work, but you’ll have to click through to be redirected upon login. +- `ACCOUNT_PREFIX.us1.dbt.com`: A unique URL specifically for your account. If you belong to multiple accounts, each will have a unique URL available as long as they have been migrated to multi-cell. +Check out [access, regions, and IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) for more information. + +Remove the following old IP addresses from your firewall and database grants: + +- `52.45.144.63` +- `54.81.134.249` +- `52.22.161.231` diff --git a/website/docs/docs/cloud/secure/ip-restrictions.md b/website/docs/docs/cloud/secure/ip-restrictions.md index 034b3a6c144..d39960dab42 100644 --- a/website/docs/docs/cloud/secure/ip-restrictions.md +++ b/website/docs/docs/cloud/secure/ip-restrictions.md @@ -13,7 +13,7 @@ import SetUpPages from '/snippets/_available-tiers-iprestrictions.md'; IP Restrictions help control which IP addresses are allowed to connect to dbt Cloud. IP restrictions allow dbt Cloud customers to meet security and compliance controls by only allowing approved IPs to connect to their dbt Cloud environment. This feature is supported in all regions across NA, Europe, and Asia-Pacific, but contact us if you have questions about availability. -## Configuring IP Restrictions +## Configuring IP restrictions To configure IP restrictions, go to **Account Settings** → **IP Restrictions**. IP restrictions provide two methods for determining which IPs can access dbt Cloud: an allowlist and a blocklist. IPs in the allowlist are allowed to access dbt Cloud, and IPs in the deny list will be blocked from accessing dbt Cloud. IP Restrictions can be used for a range of use cases, including: @@ -29,7 +29,7 @@ For any version control system integrations (Github, Gitlab, ADO, etc.) inbound To add an IP to the allowlist, from the **IP Restrictions** page: -1. Click **edit** +1. Click **Edit** 2. 
Click **Add Rule** 3. Add name and description for the rule - For example, Corporate VPN CIDR Range @@ -39,7 +39,9 @@ To add an IP to the allowlist, from the **IP Restrictions** page: - You can add multiple ranges in the same rule. 6. Click **Save** -Note that simply adding the IP Ranges will not enforce IP restrictions. For more information, see the section “Enabling Restrictions.” +Add multiple IP ranges by clicking the **Add IP range** button to create a new text field. + +Note that simply adding the IP Ranges will not enforce IP restrictions. For more information, see the [Enabling restrictions](#enabling-restrictions) section. If you only want to allow the IP ranges added to this list and deny all other requests, adding a denylist is not necessary. By default, if only an allow list is added, dbt Cloud will only allow IPs in the allowable range and deny all other IPs. However, you can add a denylist if you want to deny specific IP addresses within your allowlist CIDR range. @@ -65,9 +67,9 @@ It is possible to put an IP range on one list and then a sub-range or IP address ::: -## Enabling Restrictions +## Enabling restrictions -Once you are done adding all your ranges, IP restrictions can be enabled by selecting the **Enable IP restrictions** button and clicking **Save**. If your IP address is in any of the denylist ranges, you won’t be able to save or enable IP restrictions - this is done to prevent accidental account lockouts. If you do get locked out due to IP changes on your end, please reach out to support@dbtlabs.com +Once you are done adding all your ranges, IP restrictions can be enabled by selecting the **Enable IP restrictions** button and clicking **Save**. If your IP address is in any of the denylist ranges, you won’t be able to save or enable IP restrictions - this is done to prevent accidental account lockouts. If you do get locked out due to IP changes on your end, please reach out to support@getdbt.com Once enabled, when someone attempts to access dbt Cloud from a restricted IP, they will encounter one of the following messages depending on whether they use email & password or SSO login. diff --git a/website/docs/docs/cloud/secure/postgres-privatelink.md b/website/docs/docs/cloud/secure/postgres-privatelink.md index 58098f4c23a..864cfe4acba 100644 --- a/website/docs/docs/cloud/secure/postgres-privatelink.md +++ b/website/docs/docs/cloud/secure/postgres-privatelink.md @@ -5,6 +5,7 @@ description: "Configuring PrivateLink for Postgres" sidebar_label: "PrivateLink for Postgres" --- import SetUpPages from '/snippets/_available-tiers-privatelink.md'; +import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md'; @@ -86,3 +87,5 @@ Once dbt Cloud support completes the configuration, you can start creating new c 3. Select the private endpoint from the dropdown (this will automatically populate the hostname/account field). 4. Configure the remaining data platform details. 5. Test your connection and save it. 
+ + \ No newline at end of file diff --git a/website/docs/docs/cloud/secure/redshift-privatelink.md b/website/docs/docs/cloud/secure/redshift-privatelink.md index 23e2b4382fc..a9d4332918b 100644 --- a/website/docs/docs/cloud/secure/redshift-privatelink.md +++ b/website/docs/docs/cloud/secure/redshift-privatelink.md @@ -6,6 +6,7 @@ sidebar_label: "PrivateLink for Redshift" --- import SetUpPages from '/snippets/_available-tiers-privatelink.md'; +import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md'; @@ -115,3 +116,5 @@ Once dbt Cloud support completes the configuration, you can start creating new c 3. Select the private endpoint from the dropdown (this will automatically populate the hostname/account field). 4. Configure the remaining data platform details. 5. Test your connection and save it. + + \ No newline at end of file diff --git a/website/docs/docs/cloud/secure/vcs-privatelink.md b/website/docs/docs/cloud/secure/vcs-privatelink.md index b08154d2e72..6041b1cb4ed 100644 --- a/website/docs/docs/cloud/secure/vcs-privatelink.md +++ b/website/docs/docs/cloud/secure/vcs-privatelink.md @@ -6,6 +6,7 @@ sidebar_label: "PrivateLink for VCS" --- import SetUpPages from '/snippets/_available-tiers-privatelink.md'; +import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md'; @@ -106,3 +107,5 @@ Once dbt confirms that the PrivateLink integration is complete, you can use it i + + \ No newline at end of file diff --git a/website/docs/docs/collaborate/data-tile.md b/website/docs/docs/collaborate/data-tile.md index f40f21ebe18..efd6a0d59aa 100644 --- a/website/docs/docs/collaborate/data-tile.md +++ b/website/docs/docs/collaborate/data-tile.md @@ -9,9 +9,11 @@ image: /img/docs/collaborate/dbt-explorer/data-tile-pass.jpg # Embed data health tile in dashboards With data health tiles, stakeholders will get an at-a-glance confirmation on whether the data they’re looking at is stale or degraded. This trust signal allows teams to immediately go back into Explorer to see more details and investigate issues. + :::info Available in beta Data health tile is currently available in open beta. ::: + The data health tile: - Distills trust signals for data consumers. @@ -19,7 +21,12 @@ The data health tile: - Provides richer information and makes it easier to debug. - Revamps the existing, [job-based tiles](#job-based-data-health). +Data health tiles rely on [exposures](/docs/build/exposures) to surface trust signals in your dashboards. When you configure exposures in your dbt project, you are explicitly defining how specific outputs—like dashboards or reports—depend on your data models. + + + + ## Prerequisites @@ -34,60 +41,102 @@ First, be sure to enable [source freshness](/docs/deploy/source-freshness) in 1. Navigate to dbt Explorer by clicking on the **Explore** link in the navigation. 2. In the main **Overview** page, go to the left navigation. -3. Under the **Resources** tab, click on **Exposures** to view the exposures list. +3. Under the **Resources** tab, click on **Exposures** to view the [exposures](/docs/build/exposures) list. 4. Select a dashboard exposure and go to the **General** tab to view the data health information. -5. In this tab, you’ll see: - - Data health status: Data freshness passed, Data quality passed, Data may be stale, Data quality degraded - - Name of the exposure. +5. In this tab, you’ll see: + - Name of the exposure. + - Data health status: Data freshness passed, Data quality passed, Data may be stale, Data quality degraded. 
- Resource type (model, source, and so on). - Dashboard status: Failure, Pass, Stale. - You can also see the last check completed, the last check time, and the last check duration. -6. You can also click the **Open Dashboard** button on the upper right to immediately view this in your analytics tool. +6. You can click the **Open Dashboard** button on the upper right to immediately view this in your analytics tool. ## Embed in your dashboard -Once you’ve navigated to the auto-exposure in dbt Explorer, you’ll need to set up your dashboard status tile and [service token](/docs/dbt-cloud-apis/service-tokens): +Once you’ve navigated to the exposure in dbt Explorer, you’ll need to set up your data health tile and [service token](/docs/dbt-cloud-apis/service-tokens). You can embed data health tile to any analytics tool that supports URL or iFrame embedding. + +Follow these steps to set up your data health tile: 1. Go to **Account settings** in dbt Cloud. 2. Select **API tokens** in the left sidebar and then **Service tokens**. 3. Click on **Create service token** and give it a name. -4. Select the [**Metadata Only** permission](/docs/dbt-cloud-apis/service-tokens). This token will be used to embed the exposure tile in your dashboard in the later steps. +4. Select the [**Metadata Only**](/docs/dbt-cloud-apis/service-tokens) permission. This token will be used to embed the tile in your dashboard in the later steps. -5. Copy the **Metadata Only token** and save it in a secure location. You'll need it token in the next steps. +5. Copy the **Metadata Only** token and save it in a secure location. You'll need it token in the next steps. 6. Navigate back to dbt Explorer and select an exposure. 7. Below the **Data health** section, expand on the toggle for instructions on how to embed the exposure tile (if you're an account admin with develop permissions). 8. In the expanded toggle, you'll see a text field where you can paste your **Metadata Only token**. -9. Once you’ve pasted your token, you can select either **URL** or **iFrame** depending on which you need to install into your dashboard. +9. Once you’ve pasted your token, you can select either **URL** or **iFrame** depending on which you need to add to your dashboard. If your analytics tool supports iFrames, you can embed the dashboard tile within it. -### Embed data health tile in Tableau -To embed the data health tile in Tableau, follow these steps: +### Examples +The following examples show how to embed the data health tile in Tableau and PowerBI. + + -1. Ensure you've copied the embed iFrame content in dbt Explorer. -2. For the revamped environment-based exposure tile you can insert these fields into the following iFrame, and then embed them with your dashboard. This is the iFrame that is available from the **Exposure details** page in dbt Explorer. + - `" + ``` - `