+
Example of common support questions
+
Types of dbt Cloud-related questions our Support team can assist you with, regardless of your dbt Cloud plan:
How do I...
- set up a dbt Cloud project?
diff --git a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md
deleted file mode 100644
index 544590b18df..00000000000
--- a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md
+++ /dev/null
@@ -1,27 +0,0 @@
----
-title: "Upgrading to v1.9 (beta)"
-id: upgrading-to-v1.9
-description: New features and changes in dbt Core v1.9
-displayed_sidebar: "docs"
----
-
-## Resources
-
-- Changelog (coming soon)
-- [dbt Core CLI Installation guide](/docs/core/installation-overview)
-- [Cloud upgrade guide](/docs/dbt-versions/upgrade-dbt-version-in-cloud) — dbt Cloud is now versionless. dbt v1.9 will not appear in the version dropdown. Select **Versionless** to get all the latest features and functionality in your dbt Cloud account.
-
-## What to know before upgrading
-
-dbt Labs is committed to providing backward compatibility for all versions 1.x, except for any changes explicitly mentioned on this page. If you encounter an error upon upgrading, please let us know by [opening an issue](https://github.com/dbt-labs/dbt-core/issues/new).
-
-
-## New and changed features and functionality
-
-Features and functionality new in dbt v1.9.
-
-**Coming soon**
-
-## Quick hits
-
-**Coming soon**
\ No newline at end of file
diff --git a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md
index dd22329668c..9163047e7e0 100644
--- a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md
+++ b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md
@@ -98,13 +98,13 @@ The ability for installed packages to override built-in materializations without
### Managing changes to legacy behaviors
-dbt Core v1.8 has introduced flags for [managing changes to legacy behaviors](/reference/global-configs/legacy-behaviors). You may opt into recently introduced changes (disabled by default), or opt out of mature changes (enabled by default), by setting `True` / `False` values, respectively, for `flags` in `dbt_project.yml`.
+dbt Core v1.8 has introduced flags for [managing changes to legacy behaviors](/reference/global-configs/behavior-changes). You may opt into recently introduced changes (disabled by default), or opt out of mature changes (enabled by default), by setting `True` / `False` values, respectively, for `flags` in `dbt_project.yml`.
You can read more about each of these behavior changes in the following links:
-- (Mature, enabled by default) [Require explicit package overrides for builtin materializations](/reference/global-configs/legacy-behaviors#require_explicit_package_overrides_for_builtin_materializations)
-- (Introduced, disabled by default) [Require resource names without spaces](https://docs.getdbt.com/reference/global-configs/legacy-behaviors#require_resource_names_without_spaces)
-- (Introduced, disabled by default) [Run project hooks (`on-run-*`) in the `dbt source freshness` command](/reference/global-configs/legacy-behaviors#source_freshness_run_project_hooks)
+- (Mature, enabled by default) [Require explicit package overrides for builtin materializations](/reference/global-configs/behavior-changes#require_explicit_package_overrides_for_builtin_materializations)
+- (Introduced, disabled by default) [Require resource names without spaces](/reference/global-configs/behavior-changes#require_resource_names_without_spaces)
+- (Introduced, disabled by default) [Run project hooks (`on-run-*`) in the `dbt source freshness` command](/reference/global-configs/behavior-changes#source_freshness_run_project_hooks)
## Quick hits
diff --git a/website/docs/docs/dbt-versions/release-notes.md b/website/docs/docs/dbt-versions/release-notes.md
index a9db34334ad..5e4d0d40082 100644
--- a/website/docs/docs/dbt-versions/release-notes.md
+++ b/website/docs/docs/dbt-versions/release-notes.md
@@ -19,9 +19,13 @@ Release notes are grouped by month for both multi-tenant and virtual private clo
\* The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.
## August 2024
+- **Fix:** Fixed an issue in [dbt Explorer](/docs/collaborate/explore-projects) where navigating to a consumer project from a public node resulted in displaying a random public model rather than the original selection.
- **New**: You can now configure metrics at finer time granularities, such as hour, minute, or even second. This is particularly useful for more detailed analysis and for datasets where high-resolution time data is required, such as minute-by-minute event tracking. Refer to [dimensions](/docs/build/dimensions) for more information about time granularity.
+- **Enhancement**: Microsoft Excel now supports [saved selections](/docs/cloud-integrations/semantic-layer/excel#using-saved-selections) and [saved queries](/docs/cloud-integrations/semantic-layer/excel#using-saved-queries). Use Saved selections to save your query selections within the Excel application. The application also clears stale data in [trailing rows](/docs/cloud-integrations/semantic-layer/excel#other-settings) by default. To return your results and keep any previously selected data intact, deselect the **Clear trailing rows** option.
+- **Behavior change:** GitHub is no longer supported for OAuth login to dbt Cloud. Use a supported [SSO or OAuth provider](/docs/cloud/manage-access/sso-overview) to securely manage access to your dbt Cloud account.
## July 2024
+- **Behavior change:** `target_schema` is no longer a required configuration for [snapshots](/docs/build/snapshots). You can now target different schemas for snapshots across development and deployment environments using the [schema config](/reference/resource-configs/schema).
- **New:** [Connections](/docs/cloud/connect-data-platform/about-connections#connection-management) are now available under **Account settings** as a global setting. Previously, they were found under **Project settings**. This is being rolled out in phases over the coming weeks.
- **New:** Admins can now assign [environment-level permissions](/docs/cloud/manage-access/environment-permissions) to groups for specific roles.
- **New:** [Merge jobs](/docs/deploy/merge-jobs) for implementing [continuous deployment (CD)](/docs/deploy/continuous-deployment) workflows are now GA in dbt Cloud. Previously, you had to either set up a custom GitHub action or manually build the changes every time a pull request is merged.
@@ -147,7 +151,7 @@ The following features are new or enhanced as part of our [dbt Cloud Launch Show
-- **Behavior change:** Introduced the `require_resource_names_without_spaces` flag, opt-in and disabled by default. If set to `True`, dbt will raise an exception if it finds a resource name containing a space in your project or an installed package. This will become the default in a future version of dbt. Read [No spaces in resource names](/reference/global-configs/legacy-behaviors#no-spaces-in-resource-names) for more information.
+- **Behavior change:** Introduced the `require_resource_names_without_spaces` flag, opt-in and disabled by default. If set to `True`, dbt will raise an exception if it finds a resource name containing a space in your project or an installed package. This will become the default in a future version of dbt. Read [No spaces in resource names](/reference/global-configs/behavior-changes#no-spaces-in-resource-names) for more information.
## April 2024
@@ -159,7 +163,7 @@ The following features are new or enhanced as part of our [dbt Cloud Launch Show
-- **Behavior change:** Introduced the `require_explicit_package_overrides_for_builtin_materializations` flag, opt-in and disabled by default. If set to `True`, dbt will only use built-in materializations defined in the root project or within dbt, rather than implementations in packages. This will become the default in May 2024 (dbt Core v1.8 and "Versionless" dbt Cloud). Read [Package override for built-in materialization](/reference/global-configs/legacy-behaviors#package-override-for-built-in-materialization) for more information.
+- **Behavior change:** Introduced the `require_explicit_package_overrides_for_builtin_materializations` flag, opt-in and disabled by default. If set to `True`, dbt will only use built-in materializations defined in the root project or within dbt, rather than implementations in packages. This will become the default in May 2024 (dbt Core v1.8 and "Versionless" dbt Cloud). Read [Package override for built-in materialization](/reference/global-configs/behavior-changes#package-override-for-built-in-materialization) for more information.
**dbt Semantic Layer**
- **New**: Use Saved selections to [save your query selections](/docs/cloud-integrations/semantic-layer/gsheets#using-saved-selections) within the [Google Sheets application](/docs/cloud-integrations/semantic-layer/gsheets). They can be made private or public and refresh upon loading.
@@ -181,7 +185,7 @@ The following features are new or enhanced as part of our [dbt Cloud Launch Show
- **Fix:** `dbt parse` no longer shows an error when you use a list of filters (instead of just a string filter) on a metric.
- **Fix:** `join_to_timespine` now properly gets applied to conversion metric input measures.
- **Fix:** Fixed an issue where exports in Redshift were not always committing to the DWH, which also had the side-effect of leaving table locks open.
-- **Behavior change:** Introduced the `source_freshness_run_project_hooks` flag, opt-in and disabled by default. If set to `True`, dbt will include `on-run-*` project hooks in the `source freshness` command. This will become the default in a future version of dbt. Read [Project hooks with source freshness](/reference/global-configs/legacy-behaviors#project-hooks-with-source-freshness) for more information.
+- **Behavior change:** Introduced the `source_freshness_run_project_hooks` flag, opt-in and disabled by default. If set to `True`, dbt will include `on-run-*` project hooks in the `source freshness` command. This will become the default in a future version of dbt. Read [Project hooks with source freshness](/reference/global-configs/behavior-changes#project-hooks-with-source-freshness) for more information.
## February 2024
diff --git a/website/docs/docs/use-dbt-semantic-layer/consume-metrics.md b/website/docs/docs/use-dbt-semantic-layer/consume-metrics.md
new file mode 100644
index 00000000000..c55b4bcb632
--- /dev/null
+++ b/website/docs/docs/use-dbt-semantic-layer/consume-metrics.md
@@ -0,0 +1,38 @@
+---
+title: "Consume metrics from your Semantic Layer"
+description: "Learn how to query and consume metrics from your deployed dbt Semantic Layer using various tools and APIs."
+sidebar_label: "Consume your metrics"
+tags: [Semantic Layer]
+pagination_next: "docs/use-dbt-semantic-layer/sl-faqs"
+---
+
+After [deploying](/docs/use-dbt-semantic-layer/deploy-sl) your dbt Semantic Layer, the next important (and fun!) step is querying and consuming the metrics you’ve defined. This page links to key resources that guide you through the process of consuming metrics across different integrations, APIs, and tools, using different [query syntaxes](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata).
+
+Once your Semantic Layer is deployed, you can start querying your metrics using a variety of tools and APIs. Here are the main resources to get you started:
+
+### Available integrations
+
+Integrate the dbt Semantic Layer with a variety of business intelligence (BI) tools and data platforms, enabling seamless metric queries within your existing workflows. Explore the following integrations:
+
+- [Available integrations](/docs/cloud-integrations/avail-sl-integrations) — Review a wide range of partners such as Tableau, Google Sheets, Microsoft Excel, and more, where you can query your metrics directly from the dbt Semantic Layer.
+
+### Query with APIs
+
+To leverage the full power of the dbt Semantic Layer, you can use the dbt Semantic Layer APIs for querying metrics programmatically:
+- [dbt Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview) — Learn how to use the dbt Semantic Layer APIs to query metrics in downstream tools, ensuring consistent and reliable data metrics.
+ - [JDBC API query syntax](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata) — Dive into the syntax for querying metrics with the JDBC API, with examples and detailed instructions. A short example follows this list.
+ - [GraphQL API query syntax](/docs/dbt-cloud-apis/sl-graphql#querying) — Learn the syntax for querying metrics via the GraphQL API, including examples and detailed instructions.
+ - [Python SDK](/docs/dbt-cloud-apis/sl-python#usage-examples) — Use the Python SDK library to query metrics programmatically with Python.
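+
+For a sense of what these queries look like, here's a minimal sketch of a JDBC-style query using the `semantic_layer.query` syntax (the metric and dimension names are placeholders; swap in ones defined in your own project):
+
+```sql
+select *
+from {{
+    semantic_layer.query(
+        metrics=['order_total'],
+        group_by=[Dimension('metric_time').grain('month')],
+        limit=10
+    )
+}}
+```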
+
+### Query during development
+
+For developers working within the dbt ecosystem, it’s essential to understand how to query metrics during the development phase using MetricFlow commands:
+- [MetricFlow commands](/docs/build/metricflow-commands) — Learn how to use MetricFlow commands to query metrics directly during the development process, ensuring your metrics are correctly defined and working as expected.
+
+## Next steps
+
+After understanding the basics of querying metrics, consider optimizing your setup and ensuring the integrity of your metric definitions:
+
+- [Optimize querying performance](/docs/use-dbt-semantic-layer/sl-cache) — Improve query speed and efficiency by using declarative caching techniques.
+- [Validate semantic nodes in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci) — Ensure that any changes to dbt models don’t break your metrics by validating semantic nodes in Continuous Integration (CI) jobs.
+- [Build your metrics and semantic models](/docs/build/build-metrics-intro) — If you haven’t already, learn how to define and build your metrics and semantic models using your preferred development tool.
diff --git a/website/docs/docs/use-dbt-semantic-layer/dbt-sl.md b/website/docs/docs/use-dbt-semantic-layer/dbt-sl.md
index 73e39589587..e09a68b97c4 100644
--- a/website/docs/docs/use-dbt-semantic-layer/dbt-sl.md
+++ b/website/docs/docs/use-dbt-semantic-layer/dbt-sl.md
@@ -4,7 +4,7 @@ id: dbt-sl
description: "Learn how the dbt Semantic Layer enables data teams to centrally define and query metrics."
sidebar_label: "About the dbt Semantic Layer"
tags: [Semantic Layer]
-hide_table_of_contents: true
+hide_table_of_contents: false
pagination_next: "guides/sl-snowflake-qs"
pagination_prev: null
---
@@ -15,7 +15,8 @@ Moving metric definitions out of the BI layer and into the modeling layer allows
Refer to the [dbt Semantic Layer FAQs](/docs/use-dbt-semantic-layer/sl-faqs) or [Why we need a universal semantic layer](https://www.getdbt.com/blog/universal-semantic-layer/) blog post to learn more.
-## Explore the dbt Semantic Layer
+## Get started with the dbt Semantic Layer
+
import Features from '/snippets/_sl-plan-info.md'
@@ -25,54 +26,28 @@ product="dbt Semantic Layer"
plan="dbt Cloud Team or Enterprise"
/>
-
-
-
-
-
+This page points to various resources available to help you understand, configure, deploy, and integrate the dbt Semantic Layer. The following sections contain links to specific pages that explain each aspect in detail. Use these links to navigate directly to the information you need, whether you're setting up the Semantic Layer for the first time, deploying metrics, or integrating with downstream tools.
-
-
+Refer to the following resources to get started with the dbt Semantic Layer:
+- [Quickstart with the dbt Cloud Semantic Layer](/guides/sl-snowflake-qs) — Build and define metrics, set up the dbt Semantic Layer, and query them using our first-class integrations.
+- [dbt Semantic Layer FAQs](/docs/use-dbt-semantic-layer/sl-faqs) — Discover answers to frequently asked questions about the dbt Semantic Layer, such as availability, integrations, and more.
-
+## Configure the dbt Semantic Layer
-
+The following resources provide information on how to configure the dbt Semantic Layer:
+- [Set up the dbt Semantic Layer](/docs/use-dbt-semantic-layer/setup-sl) — Learn how to set up the dbt Semantic Layer in dbt Cloud using intuitive navigation.
+- [Architecture](/docs/use-dbt-semantic-layer/sl-architecture) — Explore the powerful components that make up the dbt Semantic Layer.
-
+## Deploy metrics
+This section provides information on how to deploy the dbt Semantic Layer and materialize your metrics:
+- [Deploy your Semantic Layer](/docs/use-dbt-semantic-layer/deploy-sl) — Run a dbt Cloud job to deploy the dbt Semantic Layer and materialize your metrics.
+- [Write queries with exports](/docs/use-dbt-semantic-layer/exports) — Use exports to write commonly used queries directly within your data platform, on a schedule.
+- [Cache common queries](/docs/use-dbt-semantic-layer/sl-cache) — Leverage result caching and declarative caching for common queries to speed up performance and reduce query computation.
-
+## Consume metrics and integrate
+Consume metrics and integrate the dbt Semantic Layer with downstream tools and applications:
+- [Consume metrics](/docs/use-dbt-semantic-layer/consume-metrics) — Query and consume metrics in downstream tools and applications using the dbt Semantic Layer.
+- [Available integrations](/docs/cloud-integrations/avail-sl-integrations) — Review a wide range of partners you can integrate and query with the dbt Semantic Layer.
+- [dbt Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview) — Use the dbt Semantic Layer APIs to query metrics in downstream tools for consistent, reliable data metrics.
-
diff --git a/website/docs/docs/use-dbt-semantic-layer/deploy-sl.md b/website/docs/docs/use-dbt-semantic-layer/deploy-sl.md
new file mode 100644
index 00000000000..1bee5480317
--- /dev/null
+++ b/website/docs/docs/use-dbt-semantic-layer/deploy-sl.md
@@ -0,0 +1,29 @@
+---
+title: "Deploy your metrics"
+id: deploy-sl
+description: "Deploy the dbt Semantic Layer in dbt Cloud by running a job to materialize your metrics."
+sidebar_label: "Deploy your metrics"
+tags: [Semantic Layer]
+pagination_next: "docs/use-dbt-semantic-layer/exports"
+---
+
+
+
+import RunProdJob from '/snippets/_sl-run-prod-job.md';
+
+
+
+## Next steps
+After you've executed a job and deployed your Semantic Layer:
+- [Set up your Semantic Layer](/docs/use-dbt-semantic-layer/setup-sl) in dbt Cloud.
+- Discover the [available integrations](/docs/cloud-integrations/avail-sl-integrations), such as Tableau, Google Sheets, Microsoft Excel, and more.
+- Start querying your metrics with the [API query syntax](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata).
+
+
+## Related docs
+- [Optimize querying performance](/docs/use-dbt-semantic-layer/sl-cache) using declarative caching.
+- [Validate semantic nodes in CI](/docs/deploy/ci-jobs#semantic-validations-in-ci) to ensure code changes made to dbt models don't break these metrics.
+- If you haven't already, learn how to [build your metrics and semantic models](/docs/build/build-metrics-intro) in your development tool of choice.
diff --git a/website/docs/docs/use-dbt-semantic-layer/setup-sl.md b/website/docs/docs/use-dbt-semantic-layer/setup-sl.md
index adad5bd9fd1..3dfa7f3aa7d 100644
--- a/website/docs/docs/use-dbt-semantic-layer/setup-sl.md
+++ b/website/docs/docs/use-dbt-semantic-layer/setup-sl.md
@@ -2,8 +2,10 @@
title: "Set up the dbt Semantic Layer"
id: setup-sl
description: "Seamlessly set up the dbt Semantic Layer in dbt Cloud using intuitive navigation."
-sidebar_label: "Set up your Semantic Layer"
+sidebar_label: "Set up the Semantic Layer"
tags: [Semantic Layer]
+pagination_next: "docs/use-dbt-semantic-layer/sl-architecture"
+pagination_prev: "guides/sl-snowflake-qs"
---
With the dbt Semantic Layer, you can centrally define business metrics, reduce code duplication and inconsistency, create self-service in downstream tools, and more.
diff --git a/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md b/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md
index 2062f9e405e..9239275ebdf 100644
--- a/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md
+++ b/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md
@@ -4,7 +4,6 @@ id: sl-architecture
description: "dbt Semantic Layer product architecture and related questions."
sidebar_label: "Semantic Layer architecture"
tags: [Semantic Layer]
-pagination_next: null
---
The dbt Semantic Layer allows you to define metrics and use various interfaces to query them. The Semantic Layer does the heavy lifting to find where the queried data exists in your data platform and generates the SQL to make the request (including performing joins).
diff --git a/website/docs/faqs/Troubleshooting/generate-har-file.md b/website/docs/faqs/Troubleshooting/generate-har-file.md
new file mode 100644
index 00000000000..0cb16711942
--- /dev/null
+++ b/website/docs/faqs/Troubleshooting/generate-har-file.md
@@ -0,0 +1,71 @@
+---
+title: "How to generate HAR files"
+description: "How to generate HAR files for debugging"
+sidebar_label: 'Generate HAR files'
+sidebar_position: 1
+keywords:
+ - HAR files
+ - HTTP Archive
+ - Troubleshooting
+ - Debugging
+---
+
+HTTP Archive (HAR) files capture data from a user’s browser session, which dbt Support uses to troubleshoot network or resource issues. They include detailed timing information about the requests made between the browser and the server.
+
+The following sections describe how to generate HAR files using common browsers such as [Google Chrome](#google-chrome), [Mozilla Firefox](#mozilla-firefox), [Apple Safari](#apple-safari), and [Microsoft Edge](#microsoft-edge).
+
+:::info
+Remove or hide any confidential or personally identifying information before you send the HAR file to dbt Labs. You can edit the file using a text editor.
+:::
+
+### Google Chrome
+
+1. Open Google Chrome.
+2. Click on **View** --> **Developer Tools**.
+3. Select the **Network** tab.
+4. Ensure that Google Chrome is recording. A red button (🔴) indicates that a recording is already in progress. Otherwise, click **Record network log**.
+5. Select **Preserve Log**.
+6. Clear any existing logs by clicking **Clear network log** (🚫).
+7. Go to the page where the issue occurred and reproduce the issue.
+8. Click **Export HAR** (the down arrow icon) to export the file as HAR. The icon is located on the same row as the **Clear network log** button.
+9. Save the HAR file.
+10. Upload the HAR file to the dbt Support ticket thread.
+
+### Mozilla Firefox
+
+1. Open Firefox.
+2. Click the application menu and then **More tools** --> **Web Developer Tools**.
+3. In the developer tools docked tab, select **Network**.
+4. Go to the page where the issue occurred and reproduce the issue. The page automatically starts recording as you navigate.
+5. When you're finished, click **Pause/Resume recording network log**.
+6. Right-click anywhere in the **File** column and select **Save All as HAR**.
+7. Save the HAR file.
+8. Upload the HAR file to the dbt Support ticket thread.
+
+### Apple Safari
+
+1. Open Safari.
+2. If the **Develop** menu doesn't appear in the menu bar, go to **Safari** and then **Settings**.
+3. Click **Advanced**.
+4. Select the **Show features for web developers** checkbox.
+5. From the **Develop** menu, select **Show Web Inspector**.
+6. Click the **Network** tab.
+7. Go to the page where the issue occurred and reproduce the issue.
+8. When you're finished, click **Export**.
+9. Save the file.
+10. Upload the HAR file to the dbt Support ticket thread.
+
+### Microsoft Edge
+
+1. Open Microsoft Edge.
+2. Click the **Settings and more** menu (...) to the right of the toolbar and then select **More tools** --> **Developer tools**.
+3. Click **Network**.
+4. Ensure that Microsoft Edge is recording. A red button (🔴) indicates that a recording is already in progress. Otherwise, click **Record network log**.
+5. Go to the page where the issue occurred and reproduce the issue.
+6. When you're finished, click **Stop recording network log**.
+7. Click **Export HAR** (the down arrow icon) or press **Ctrl + S** to export the file as HAR.
+8. Save the HAR file.
+9. Upload the HAR file to the dbt Support ticket thread.
+
+### Additional resources
+Check out the [How to generate a HAR file in Chrome](https://www.loom.com/share/cabdb7be338243f188eb619b4d1d79ca) video for a visual guide.
diff --git a/website/docs/guides/core-cloud-2.md b/website/docs/guides/core-cloud-2.md
index 93e9e92bfa4..cee1e8029c2 100644
--- a/website/docs/guides/core-cloud-2.md
+++ b/website/docs/guides/core-cloud-2.md
@@ -20,6 +20,20 @@ import CoretoCloudTable from '/snippets/_core-to-cloud-guide-table.md';
+
+
+ - dbt Cloud is the fastest and most reliable way to deploy dbt. It enables you to develop, test, deploy, and explore data products using a single, fully managed service. It also supports:
+ - Development experiences tailored to multiple personas ([dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) or [dbt Cloud CLI](/docs/cloud/cloud-cli-installation))
+ - Out-of-the-box [CI/CD workflows](/docs/deploy/ci-jobs)
+ - The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) for consistent metrics
+ - Domain ownership of data with multi-project [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) setups
+ - [dbt Explorer](/docs/collaborate/explore-projects) for easier data discovery and understanding
+
+ Learn more about [dbt Cloud features](/docs/cloud/about-cloud/dbt-cloud-features).
+- dbt Core is an open-source tool that enables data teams to define and execute data transformations in a cloud data warehouse following analytics engineering best practices. While this can work well for ‘single players’ and small technical teams, all development happens on a command-line interface, and production deployments must be self-hosted and maintained. Maintaining and scaling this setup requires significant, costly work that adds up over time.
+
+
+
## What you'll learn
Today thousands of companies, with data teams ranging in size from 2 to 2,000, rely on dbt Cloud to accelerate data work, increase collaboration, and win the trust of the business. Understanding what you'll need to do in order to move between dbt Cloud and your current Core deployment will help you strategize and plan for your move.
@@ -182,6 +196,7 @@ This guide should now have given you some insight and equipped you with a framew
+
Congratulations on finishing this guide, we hope it's given you insight into the considerations you need to take to best plan your move to dbt Cloud.
For the next steps, you can continue exploring our 3-part-guide series on moving from dbt Core to dbt Cloud:
diff --git a/website/docs/guides/core-to-cloud-1.md b/website/docs/guides/core-to-cloud-1.md
index 171e844d7e5..99c6ed82bf1 100644
--- a/website/docs/guides/core-to-cloud-1.md
+++ b/website/docs/guides/core-to-cloud-1.md
@@ -24,17 +24,19 @@ import CoretoCloudTable from '/snippets/_core-to-cloud-guide-table.md';
+
-dbt Cloud is the fastest and most reliable way to deploy dbt. It enables you to develop, test, deploy, and explore data products using a single, fully managed service. It also supports:
-- Development experiences tailored to multiple personas ([dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) or [dbt Cloud CLI](/docs/cloud/cloud-cli-installation))
-- Out-of-the-box [CI/CD workflows](/docs/deploy/ci-jobs)
-- The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) for consistent metrics
-- Domain ownership of data with multi-project [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) setups
-- [dbt Explorer](/docs/collaborate/explore-projects) for easier data discovery and understanding
+ - dbt Cloud is the fastest and most reliable way to deploy dbt. It enables you to develop, test, deploy, and explore data products using a single, fully managed service. It also supports:
+ - Development experiences tailored to multiple personas ([dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) or [dbt Cloud CLI](/docs/cloud/cloud-cli-installation))
+ - Out-of-the-box [CI/CD workflows](/docs/deploy/ci-jobs)
+ - The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) for consistent metrics
+ - Domain ownership of data with multi-project [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) setups
+ - [dbt Explorer](/docs/collaborate/explore-projects) for easier data discovery and understanding
-Learn more about [dbt Cloud features](/docs/cloud/about-cloud/dbt-cloud-features).
+ Learn more about [dbt Cloud features](/docs/cloud/about-cloud/dbt-cloud-features).
+- dbt Core is an open-source tool that enables data teams to define and execute data transformations in a cloud data warehouse following analytics engineering best practices. While this can work well for ‘single players’ and small technical teams, all development happens on a command-line interface, and production deployments must be self-hosted and maintained. Maintaining and scaling this setup requires significant, costly work that adds up over time.
-dbt Core is an open-source tool that enables data teams to define and execute data transformations in a cloud data warehouse following analytics engineering best practices. While this can work well for ‘single players’ and small technical teams, all development happens on a command-line interface, and production deployments must be self-hosted and maintained. This requires significant, costly work that adds up over time to maintain and scale.
+
## What you'll learn
@@ -57,7 +59,7 @@ This guide outlines the steps you need to take to move from dbt Core to dbt Clou
## Prerequisites
- You have an existing dbt Core project connected to a Git repository and data platform supported in [dbt Cloud](/docs/cloud/connect-data-platform/about-connections).
-- A [supported version](/docs/dbt-versions/core) of dbt or select [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) of dbt.
+- A [supported version](/docs/dbt-versions/core) of dbt, or select [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) in dbt Cloud.
- You have a dbt Cloud account. **[Don't have one? Start your free trial today](https://www.getdbt.com/signup)**!
## Account setup
@@ -84,8 +86,10 @@ This section outlines the considerations and methods to connect your data platfo
1. In dbt Cloud, set up your [data platform connections](/docs/cloud/connect-data-platform/about-connections) and [environment variables](/docs/build/environment-variables). dbt Cloud can connect with a variety of data platform providers including:
- [AlloyDB](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb)
+ - [Amazon Athena](/docs/cloud/connect-data-platform/connect-amazon-athena) (beta)
- [Amazon Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb)
- [Apache Spark](/docs/cloud/connect-data-platform/connect-apache-spark)
+ - [Azure Synapse Analytics](/docs/cloud/connect-data-platform/connect-azure-synapse-analytics)
- [Databricks](/docs/cloud/connect-data-platform/connect-databricks)
- [Google BigQuery](/docs/cloud/connect-data-platform/connect-bigquery)
- [Microsoft Fabric](/docs/cloud/connect-data-platform/connect-microsoft-fabric)
@@ -230,6 +234,8 @@ Explore these additional configurations to optimize your dbt Cloud orchestration
Building a custom solution to efficiently check code upon pull requests is complicated. With dbt Cloud, you can enable [continuous integration / continuous deployment (CI/CD)](/docs/deploy/continuous-integration) and configure dbt Cloud to run your dbt projects in a temporary schema when new commits are pushed to open pull requests.
+
+
This build-on-PR functionality is a great way to catch bugs before deploying to production, and an essential tool for data practitioners.
1. Set up an integration with a native Git application (such as Azure DevOps, GitHub, GitLab) and a CI environment in dbt Cloud.
diff --git a/website/docs/guides/custom-cicd-pipelines.md b/website/docs/guides/custom-cicd-pipelines.md
index 59a7767c69b..be23524d096 100644
--- a/website/docs/guides/custom-cicd-pipelines.md
+++ b/website/docs/guides/custom-cicd-pipelines.md
@@ -10,6 +10,9 @@ hide_table_of_contents: true
tags: ['dbt Cloud', 'Orchestration', 'CI']
level: 'Intermediate'
recently_updated: true
+search_weight: "heavy"
+keywords:
+ - bitbucket pipeline, custom pipelines, github, gitlab, azure devops, ci/cd custom pipeline
---
@@ -19,7 +22,6 @@ One of the core tenets of dbt is that analytic code should be version controlled
A note on parlance in this article since each code hosting platform uses different terms for similar concepts. The terms `pull request` (PR) and `merge request` (MR) are used interchangeably to mean the process of merging one branch into another branch.
-
### What are pipelines?
Pipelines (which are known by many names, such as workflows, actions, or build steps) are a series of pre-defined jobs that are triggered by specific events in your repository (PR created, commit pushed, branch merged, etc). Those jobs can do pretty much anything your heart desires assuming you have the proper security access and coding chops.
diff --git a/website/docs/guides/debug-errors.md b/website/docs/guides/debug-errors.md
index 11f02f325a4..58776fa181f 100644
--- a/website/docs/guides/debug-errors.md
+++ b/website/docs/guides/debug-errors.md
@@ -390,4 +390,17 @@ _(More likely for dbt Core users)_
If you just opened a SQL file in the `target/` directory to help debug an issue, it's not uncommon to accidentally edit that file! To avoid this, try changing your code editor settings to grey out any files in the `target/` directory — the visual cue will help avoid the issue.
-
\ No newline at end of file
+## FAQs
+
+Here are some useful FAQs to help you debug your dbt project:
+
+-
+-
+-
+-
+-
+-
+-
+-
+
+
diff --git a/website/docs/guides/sl-snowflake-qs.md b/website/docs/guides/sl-snowflake-qs.md
index 6d9f88ab159..fb72ee0057e 100644
--- a/website/docs/guides/sl-snowflake-qs.md
+++ b/website/docs/guides/sl-snowflake-qs.md
@@ -619,6 +619,11 @@ select * from final
In the following steps, you'll use semantic models to define how to interpret the data related to orders. A semantic model includes entities (like ID columns serving as keys for joining data), dimensions (for grouping or filtering data), and measures (for data aggregations).
1. In the `metrics` sub-directory, create a new file `fct_orders.yml`.
+
+:::tip
+Make sure to save all semantic models and metrics under the directory defined in the [`model-paths`](/reference/project-configs/model-paths) (or a subdirectory of it, like `models/semantic_models/`). If you save them outside of this path, it will result in an empty `semantic_manifest.json` file, and your semantic models or metrics won't be recognized.
+:::
+
2. Add the following code to that newly created file:
@@ -765,7 +770,11 @@ There are different types of metrics you can configure:
Once you've created your semantic models, it's time to start referencing those measures you made to create some metrics:
-Add metrics to your `fct_orders.yml` semantic model file:
+1. Add metrics to your `fct_orders.yml` semantic model file:
+
+:::tip
+Make sure to save all semantic models and metrics under the directory defined in the [`model-paths`](/reference/project-configs/model-paths) (or a subdirectory of it, like `models/semantic_models/`). If you save them outside of this path, it will result in an empty `semantic_manifest.json` file, and your semantic models or metrics won't be recognized.
+:::
@@ -946,15 +955,6 @@ https://github.com/dbt-labs/docs.getdbt.com/blob/current/website/snippets/_sl-ru
-
-
-What’s happening internally?
-
-- Merging the code into your main branch allows dbt Cloud to pull those changes and build the definition in the manifest produced by the run.
-- Re-running the job in the deployment environment helps materialize the models, which the metrics depend on, in the data platform. It also makes sure that the manifest is up to date.
-- The Semantic Layer APIs pull in the most recent manifest and enables your integration to extract metadata from it.
-
-
## Set up dbt Semantic Layer
diff --git a/website/docs/reference/dbt-jinja-functions/execute.md b/website/docs/reference/dbt-jinja-functions/execute.md
index f99bfa64734..65cd4708dc8 100644
--- a/website/docs/reference/dbt-jinja-functions/execute.md
+++ b/website/docs/reference/dbt-jinja-functions/execute.md
@@ -9,7 +9,7 @@ description: "Use `execute` to return True when dbt is in 'execute' mode."
When you execute a `dbt compile` or `dbt run` command, dbt:
-1. Reads all of the files in your project and generates a "manifest" comprised of models, tests, and other graph nodes present in your project. During this phase, dbt uses the `ref` statements it finds to generate the DAG for your project. **No SQL is run during this phase**, and `execute == False`.
+1. Reads all of the files in your project and generates a [manifest](/reference/artifacts/manifest-json) comprised of models, tests, and other graph nodes present in your project. During this phase, dbt uses the [`ref`](/reference/dbt-jinja-functions/ref) and [`source`](/reference/dbt-jinja-functions/source) statements it finds to generate the DAG for your project. **No SQL is run during this phase**, and `execute == False`.
2. Compiles (and runs) each node (e.g., building models or running tests). **SQL is run during this phase**, and `execute == True`.
Any Jinja that relies on a result being returned from the database will error during the parse phase. For example, this SQL will return an error:
diff --git a/website/docs/reference/dbt-jinja-functions/model.md b/website/docs/reference/dbt-jinja-functions/model.md
index 903851617f2..516981e11e3 100644
--- a/website/docs/reference/dbt-jinja-functions/model.md
+++ b/website/docs/reference/dbt-jinja-functions/model.md
@@ -11,7 +11,7 @@ description: "`model` is the dbt graph object (or node) for the current model."
For example:
```jinja
-{% if model.config.materialization == 'view' %}
+{% if model.config.materialized == 'view' %}
{{ log(model.name ~ " is a view.", info=True) }}
{% endif %}
```
diff --git a/website/docs/reference/global-configs/about-global-configs.md b/website/docs/reference/global-configs/about-global-configs.md
index bbbe63ac439..3708b8c96be 100644
--- a/website/docs/reference/global-configs/about-global-configs.md
+++ b/website/docs/reference/global-configs/about-global-configs.md
@@ -16,7 +16,7 @@ There is a significant overlap between dbt's flags and dbt's command line option
### Setting flags
There are multiple ways of setting flags, which depend on the use case:
-- **[Project-level `flags` in `dbt_project.yml`](/reference/global-configs/project-flags):** Define version-controlled defaults for everyone running this project. Preserve [legacy behaviors](/reference/global-configs/legacy-behaviors) until their slated deprecation.
+- **[Project-level `flags` in `dbt_project.yml`](/reference/global-configs/project-flags):** Define version-controlled defaults for everyone running this project. Also, opt in or opt out of [behavior changes](/reference/global-configs/behavior-changes) to manage your migration off legacy functionality.
- **[Environment variables](/reference/global-configs/environment-variable-configs):** Define different behavior in different runtime environments (development vs. production vs. [continuous integration](/docs/deploy/continuous-integration)), or different behavior for different users in development (based on personal preferences).
- **[CLI options](/reference/global-configs/command-line-options):** Define behavior specific to _this invocation_. Supported for all dbt commands.
@@ -41,7 +41,7 @@ dbt run --no-fail-fast # set to False
There are two categories of exceptions:
1. **Flags setting file paths:** Flags for file paths that are relevant to runtime execution (for example, `--log-path` or `--state`) cannot be set in `dbt_project.yml`. To override defaults, pass CLI options or set environment variables (`DBT_LOG_PATH`, `DBT_STATE`). Flags that tell dbt where to find project resources (for example, `model-paths`) are set in `dbt_project.yml`, but as a top-level key, outside the `flags` dictionary; these configs are expected to be fully static and never vary based on the command or execution environment.
-2. **Opt-in flags:** Flags opting into [legacy dbt behaviors](/reference/global-configs/legacy-behaviors) can _only_ be defined in `dbt_project.yml`. These are intended to be set in version control and migrated via pull/merge request. Their values should not diverge indefinitely across invocations, environments, or users.
+2. **Opt-in flags:** Flags opting in or out of [behavior changes](/reference/global-configs/behavior-changes) can _only_ be defined in `dbt_project.yml`. These are intended to be set in version control and migrated via pull/merge request. Their values should not diverge indefinitely across invocations, environments, or users.
### Accessing flags
@@ -84,7 +84,7 @@ Because the values of `flags` can differ across invocations, we strongly advise
| [quiet](/reference/global-configs/logs#suppress-non-error-logs-in-output) | boolean | False | ❌ | `DBT_QUIET` | `--quiet` | ✅ |
| [resource-type](/reference/global-configs/resource-type) (v1.8+) | string | None | ❌ | `DBT_RESOURCE_TYPES` <br /> `DBT_EXCLUDE_RESOURCE_TYPES` | `--resource-type` <br /> `--exclude-resource-type` | ✅ |
| [send_anonymous_usage_stats](/reference/global-configs/usage-stats) | boolean | True | ✅ | `DBT_SEND_ANONYMOUS_USAGE_STATS` | `--send-anonymous-usage-stats`, `--no-send-anonymous-usage-stats` | ❌ |
-| [source_freshness_run_project_hooks](/reference/global-configs/legacy-behaviors#source_freshness_run_project_hooks) | boolean | False | ✅ | ❌ | ❌ | ❌ |
+| [source_freshness_run_project_hooks](/reference/global-configs/behavior-changes#source_freshness_run_project_hooks) | boolean | False | ✅ | ❌ | ❌ | ❌ |
| [state](/reference/node-selection/defer) | path | none | ❌ | `DBT_STATE`, `DBT_DEFER_STATE` | `--state`, `--defer-state` | ❌ |
| [static_parser](/reference/global-configs/parsing#static-parser) | boolean | True | ✅ | `DBT_STATIC_PARSER` | `--static-parser`, `--no-static-parser` | ❌ |
| [store_failures](/reference/resource-configs/store_failures) | boolean | False | ✅ (as resource config) | `DBT_STORE_FAILURES` | `--store-failures`, `--no-store-failures` | ✅ |
diff --git a/website/docs/reference/global-configs/legacy-behaviors.md b/website/docs/reference/global-configs/behavior-changes.md
similarity index 75%
rename from website/docs/reference/global-configs/legacy-behaviors.md
rename to website/docs/reference/global-configs/behavior-changes.md
index 1450fda1459..20f5722b944 100644
--- a/website/docs/reference/global-configs/legacy-behaviors.md
+++ b/website/docs/reference/global-configs/behavior-changes.md
@@ -1,7 +1,7 @@
---
-title: "Legacy behaviors"
-id: "legacy-behaviors"
-sidebar: "Legacy behaviors"
+title: "Behavior changes"
+id: "behavior-changes"
+sidebar: "Behavior changes"
---
Most flags exist to configure runtime behaviors with multiple valid choices. The right choice may vary based on the environment, user preference, or the specific invocation.
@@ -12,10 +12,31 @@ Another category of flags provides existing projects with a migration window for
- Providing maintainability of dbt software. Every fork in behavior requires additional testing & cognitive overhead that slows future development. These flags exist to facilitate migration from "current" to "better," not to stick around forever.
These flags go through three phases of development:
-1. **Introduction (disabled by default):** dbt adds logic to support both 'old' + 'new' behaviors. The 'new' behavior is gated behind a flag, disabled by default, preserving the old behavior.
+1. **Introduction (disabled by default):** dbt adds logic to support both 'old' and 'new' behaviors. The 'new' behavior is gated behind a flag, disabled by default, preserving the old behavior.
2. **Maturity (enabled by default):** The default value of the flag is switched, from `false` to `true`, enabling the new behavior by default. Users can preserve the 'old' behavior and opt out of the 'new' behavior by setting the flag to `false` in their projects. They may see deprecation warnings when they do so.
3. **Removal (generally enabled):** After marking the flag for deprecation, we remove it along with the 'old' behavior it supported from the dbt codebases. We aim to support most flags indefinitely, but we're not committed to supporting them forever. If we choose to remove a flag, we'll offer significant advance notice.
+## What is a behavior change?
+
+The same dbt project code and the same dbt commands return one result before the behavior change, and they return a different result after the behavior change.
+
+Examples of behavior changes:
+- dbt begins raising a validation _error_ that it didn't previously.
+- dbt changes the signature of a built-in macro. Your project has a custom reimplementation of that macro. This could lead to errors, because your custom reimplementation will be passed arguments it cannot accept.
+- A dbt adapter renames or removes a method that was previously available on the `{{ adapter }}` object in the dbt-Jinja context.
+- dbt makes a breaking change to contracted metadata artifacts by deleting a required field, changing the name or type of an existing field, or removing the default value of an existing field ([README](https://github.com/dbt-labs/dbt-core/blob/37d382c8e768d1e72acd767e0afdcb1f0dc5e9c5/core/dbt/artifacts/README.md#breaking-changes)).
+- dbt removes one of the fields from [structured logs](/reference/events-logging#structured-logging).
+
+The following are **not** behavior changes:
+- Fixing a bug where the previous behavior was defective, undesirable, or undocumented.
+- dbt begins raising a _warning_ that it didn't previously.
+- dbt updates the language of human-friendly messages in log events.
+- dbt makes a non-breaking change to contracted metadata artifacts by adding a new field with a default, or deleting a field with a default ([README](https://github.com/dbt-labs/dbt-core/blob/37d382c8e768d1e72acd767e0afdcb1f0dc5e9c5/core/dbt/artifacts/README.md#non-breaking-changes)).
+
+The vast majority of changes are not behavior changes. Because introducing these changes does not require any action on the part of users, they are included in continuous releases of dbt Cloud and patch releases of dbt Core.
+
+By contrast, behavior change migrations happen slowly, over the course of months, facilitated by behavior change flags. The flags are loosely coupled to the specific dbt runtime version. By setting flags, users have control over opting in (and later opting out) of these changes.
+
## Behavior change flags
These flags _must_ be set in the `flags` dictionary in `dbt_project.yml`. They configure behaviors closely tied to project code, which means they should be defined in version control and modified through pull or merge requests, with the same testing and peer review.
diff --git a/website/docs/reference/global-configs/project-flags.md b/website/docs/reference/global-configs/project-flags.md
index 896276d9735..cdbe3463b14 100644
--- a/website/docs/reference/global-configs/project-flags.md
+++ b/website/docs/reference/global-configs/project-flags.md
@@ -17,7 +17,7 @@ flags:
Reference the [table of all flags](/reference/global-configs/about-global-configs#available-flags) to see which global configs are available for setting in [`dbt_project.yml`](/reference/dbt_project.yml).
-The `flags` dictionary is the _only_ place you can opt out of [behavior changes](/reference/global-configs/legacy-behaviors), while the legacy behavior is still supported.
+The `flags` dictionary is the _only_ place you can opt out of [behavior changes](/reference/global-configs/behavior-changes), while the legacy behavior is still supported.
diff --git a/website/docs/reference/resource-configs/athena-configs.md b/website/docs/reference/resource-configs/athena-configs.md
new file mode 100644
index 00000000000..f871ede9fab
--- /dev/null
+++ b/website/docs/reference/resource-configs/athena-configs.md
@@ -0,0 +1,552 @@
+---
+title: "Amazon Athena configurations"
+description: "Reference article for the Amazon Athena adapter for dbt Core and dbt Cloud."
+id: "athena-configs"
+---
+
+## Models
+
+### Table configuration
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `external_location` | None | The full S3 path to where the table is saved. It only works with incremental models. It doesn't work with Hive tables with `ha` set to `true`. |
+| `partitioned_by` | None | An array of columns by which the table will be partitioned. Currently limited to 100 partitions. |
+| `bucketed_by` | None | An array of the columns to bucket data. Ignored if using Iceberg. |
+| `bucket_count` | None | The number of buckets for bucketing your data. This parameter is ignored if using Iceberg. |
+| `table_type` | Hive | The type of table. Supports `hive` or `iceberg`. |
+| `ha` | False | Build the table using the high-availability method. Only available for Hive tables. |
+| `format` | Parquet | The data format for the table. Supports `ORC`, `PARQUET`, `AVRO`, `JSON`, and `TEXTFILE`. |
+| `write_compression` | None | The compression type for any storage format that allows compression. |
+| `field_delimeter` | None | The custom field delimiter to use when the format is set to `TEXTFILE`. |
+| `table_properties` | N/A | The table properties to add to the table. This is only for Iceberg. |
+| `native_drop` | N/A | Relation drop operations will be performed with SQL, not direct Glue API calls. No S3 calls will be made to manage data in S3. Data in S3 will only be cleaned up for Iceberg tables. See the [AWS docs](https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg-managing-tables.html) for more info. Iceberg DROP TABLE operations may time out if they take longer than 60 seconds. |
+| `seed_by_insert` | False | Creates seeds using a SQL insert statement. Seed files can't exceed the Athena limit of 262,144 bytes. |
+| `force_batch` | False | Run the table creation directly in batch insert mode. Useful when the standard table creation fails due to partition limitations. |
+| `unique_tmp_table_suffix` | False | Replace the `__dbt_tmp` table suffix with a unique UUID for incremental models using insert overwrite on Hive tables. |
+| `temp_schema` | None | Defines a schema to hold temporary create statements used in incremental model runs. The schema will be created in the model's target database if it doesn't exist. |
+| `lf_tags_config` | None | [AWS Lake Formation](#aws-lake-formation-integration) tags to associate with the table and columns. Existing tags will be removed. <br /> • `enabled` (`default=False`): whether LF tags management is enabled for a model. <br /> • `tags`: dictionary with tags and their values to assign to the model. <br /> • `tags_columns`: dictionary with a tag key, value, and list of columns they must be assigned to. |
+| `lf_inherited_tags` | None | List of the Lake Formation tag keys that are to be inherited from the database level and shouldn't be removed during the assignment of those defined in `lf_tags_config`. |
+| `lf_grants` | None | Lake Formation grants config for `data_cell` filters. |
+
+#### Configuration examples
+
+
+
+
+
+
+
+```sql
+{{
+ config(
+ materialized='incremental',
+ incremental_strategy='append',
+ on_schema_change='append_new_columns',
+ table_type='iceberg',
+ schema='test_schema',
+ lf_tags_config={
+ 'enabled': true,
+ 'tags': {
+ 'tag1': 'value1',
+ 'tag2': 'value2'
+ },
+ 'tags_columns': {
+ 'tag1': {
+ 'value1': ['column1', 'column2'],
+ 'value2': ['column3', 'column4']
+ }
+ },
+ 'inherited_tags': ['tag1', 'tag2']
+ }
+ )
+}}
+```
+
+
+
+
+
+
+
+
+```yaml
+ +lf_tags_config:
+ enabled: true
+ tags:
+ tag1: value1
+ tag2: value2
+ tags_columns:
+ tag1:
+ value1: [ column1, column2 ]
+ inherited_tags: [ tag1, tag2 ]
+```
+
+
+
+
+
+
+
+```python
+lf_grants={
+ 'data_cell_filters': {
+ 'enabled': True | False,
+ 'filters': {
+ 'filter_name': {
+ 'row_filter': '',
+ 'principals': ['principal_arn1', 'principal_arn2']
+ }
+ }
+ }
+ }
+```
+
+
+
+
+
+There are some limitations and recommendations that should be considered:
+
+- `lf_tags` and `lf_tags_columns` configs support only attaching LF tags to the corresponding resources.
+- We recommend managing LF Tags permissions somewhere outside dbt. For example, with [Terraform](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lakeformation_permissions) or the [AWS CDK](https://docs.aws.amazon.com/cdk/api/v1/docs/aws-lakeformation-readme.html).
+- `data_cell_filters` management can't be automated outside dbt because the filter can't be attached to a table that doesn't exist yet. Once you `enable` this config, dbt will set all filters and their permissions during every dbt run. This approach keeps the row-level security configuration in sync after every dbt run, applying any changes that occur: dropping, creating, and updating filters and their permissions.
+- Any tags listed in `lf_inherited_tags` should be strictly inherited from the database level and never overridden at the table and column level.
+- Currently, `dbt-athena` does not differentiate between an inherited tag association and an override it made previously.
+ - For example, if an `lf_tags_config` value overrides an inherited tag in one run, and that override is removed before a subsequent run, the prior override will linger and no longer be encoded anywhere (neither in Terraform, where the inherited value is configured, nor in the dbt project, where the override previously existed but now is gone).
+
+
+### Table location
+
+The saved location of a table is determined in precedence by the following conditions:
+
+1. If `external_location` is defined, that value is used.
+2. If `s3_data_dir` is defined, the path is determined by that and `s3_data_naming`.
+3. If `s3_data_dir` is not defined, data is stored under `s3_staging_dir/tables/`.
+
+The following options are available for `s3_data_naming`:
+
+- `unique`: `{s3_data_dir}/{uuid4()}/`
+- `table`: `{s3_data_dir}/{table}/`
+- `table_unique`: `{s3_data_dir}/{table}/{uuid4()}/`
+- `schema_table`: `{s3_data_dir}/{schema}/{table}/`
+- `schema_table_unique`: `{s3_data_dir}/{schema}/{table}/{uuid4()}/`
+
+You can set `s3_data_naming` globally in the target profile, override it in the table config, or set it for groups of models in `dbt_project.yml` (see the sketch at the end of this section).
+
+Note: If you're using a workgroup with a default output location configured, `s3_data_naming` ignores any configured buckets and uses the location configured in the workgroup.
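+
+For example, here's a minimal sketch of overriding `s3_data_naming` for a single model in its config block (the model body is a placeholder):
+
+```sql
+{{ config(
+    materialized='table',
+    format='parquet',
+    s3_data_naming='schema_table_unique'
+) }}
+
+select 'a' as user_id,
+       'active' as status
+```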
+
+### Incremental models
+
+The following [incremental models](https://docs.getdbt.com/docs/build/incremental-models) strategies are supported:
+
+- `insert_overwrite` (default): The insert-overwrite strategy deletes the overlapping partitions from the destination table and then inserts the new records from the source. This strategy depends on the `partitioned_by` keyword! dbt will fall back to the `append` strategy if no partitions are defined. See the sketch after this list for a typical configuration.
+- `append`: Insert new records without updating, deleting or overwriting any existing data. There might be duplicate data (great for log or historical data).
+- `merge`: Conditionally updates, deletes, or inserts rows into an Iceberg table. Used in combination with `unique_key`. It is only available when using Iceberg.
+
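+As a rough sketch, an `insert_overwrite` model partitioned by day might be configured like the following (the table and column names are placeholders):
+
+```sql
+{{ config(
+    materialized='incremental',
+    incremental_strategy='insert_overwrite',
+    partitioned_by=['date_day'],
+    format='parquet'
+) }}
+
+select order_id,
+       status,
+       cast(order_date as date) as date_day
+from {{ ref('stg_orders') }}
+
+{% if is_incremental() %}
+  -- only rebuild recent partitions on incremental runs
+  where cast(order_date as date) >= date_add('day', -3, current_date)
+{% endif %}
+```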
+
+### On schema change
+
+The `on_schema_change` option reflects changes of the schema in incremental models. The values you can set this to are:
+
+- `ignore` (default)
+- `fail`
+- `append_new_columns`
+- `sync_all_columns`
+
+To learn more, refer to [What if the columns of my incremental model change](/docs/build/incremental-models#what-if-the-columns-of-my-incremental-model-change).
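+
+As a brief sketch (the source model name is a placeholder), opting into `append_new_columns` on an incremental model looks like this:
+
+```sql
+{{ config(
+    materialized='incremental',
+    incremental_strategy='append',
+    on_schema_change='append_new_columns'
+) }}
+
+select * from {{ ref('stg_events') }}
+```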
+
+### Iceberg
+
+The adapter supports table materialization for Iceberg.
+
+For example:
+
+```sql
+{{ config(
+ materialized='table',
+ table_type='iceberg',
+ format='parquet',
+ partitioned_by=['bucket(user_id, 5)'],
+ table_properties={
+ 'optimize_rewrite_delete_file_threshold': '2'
+ }
+) }}
+
+select 'A' as user_id,
+ 'pi' as name,
+ 'active' as status,
+ 17.89 as cost,
+ 1 as quantity,
+ 100000000 as quantity_big,
+ current_date as my_date
+```
+
+Iceberg supports bucketing as hidden partitions. Use the `partitioned_by` config to add specific bucketing
+conditions.
+
+Iceberg supports the `PARQUET`, `AVRO`, and `ORC` table formats for data.
+
+The following are the supported strategies for using Iceberg incrementally:
+
+- `append`: New records are appended to the table (this can lead to duplicates).
+- `merge`: Perform an update and insert (and optional delete) where new and existing records are added. This is only available with Athena engine version 3.
+  - `unique_key` (required): Columns that define a unique record in the source and target tables.
+ - `incremental_predicates` (optional): The SQL conditions that enable custom join clauses in the merge statement. This helps improve performance via predicate pushdown on target tables.
+ - `delete_condition` (optional): SQL condition that identifies records that should be deleted.
+ - `update_condition` (optional): SQL condition that identifies records that should be updated.
+ - `insert_condition` (optional): SQL condition that identifies records that should be inserted.
+
+`incremental_predicates`, `delete_condition`, `update_condition` and `insert_condition` can include any column of the incremental table (`src`) or the final table (`target`). Column names must be prefixed by either `src` or `target` to prevent a `Column is ambiguous` error.
+
+
+
+
+
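+For example, the following model uses the `merge` strategy with `incremental_predicates` and a `delete_condition`:
+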
+```sql
+{{ config(
+ materialized='incremental',
+ table_type='iceberg',
+ incremental_strategy='merge',
+ unique_key='user_id',
+ incremental_predicates=["src.quantity > 1", "target.my_date >= now() - interval '4' year"],
+ delete_condition="src.status != 'active' and target.my_date < now() - interval '2' year",
+ format='parquet'
+) }}
+
+select 'A' as user_id,
+ 'pi' as name,
+ 'active' as status,
+ 17.89 as cost,
+ 1 as quantity,
+ 100000000 as quantity_big,
+ current_date as my_date
+```
+
+
+
+
+
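+The next example uses `update_condition` so that only target records with `id > 1` are updated:
+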
+```sql
+{{ config(
+ materialized='incremental',
+ incremental_strategy='merge',
+ unique_key=['id'],
+ update_condition='target.id > 1',
+ schema='sandbox'
+ )
+}}
+
+{% if is_incremental() %}
+
+select * from (
+ values
+ (1, 'v1-updated')
+ , (2, 'v2-updated')
+) as t (id, value)
+
+{% else %}
+
+select * from (
+ values
+ (-1, 'v-1')
+ , (0, 'v0')
+ , (1, 'v1')
+ , (2, 'v2')
+) as t (id, value)
+
+{% endif %}
+```
+
+
+
+
+
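+And this example uses `insert_condition` so that only records meeting the condition are inserted:
+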
+```sql
+{{ config(
+ materialized='incremental',
+ incremental_strategy='merge',
+ unique_key=['id'],
+ insert_condition='target.status != 0',
+ schema='sandbox'
+ )
+}}
+
+select * from (
+ values
+ (1, 0)
+ , (2, 1)
+) as t (id, status)
+
+```
+
+
+
+
+
+### High availability (HA) table
+
+The current implementation of the table materialization can lead to downtime, as the target table is dropped and re-created. For less destructive behavior, you can use the `ha` config on your `table` materialized models. It leverages the table versions feature of the Glue catalog, creating a temporary table and swapping the target table to the location of the temporary table. This materialization is only available for `table_type=hive` and requires using unique locations. For Iceberg, high availability is the default.
+
+By default, the materialization keeps the last 4 table versions, but you can change it by setting `versions_to_keep`.
+
+```sql
+{{ config(
+ materialized='table',
+ ha=true,
+ format='parquet',
+ table_type='hive',
+ partitioned_by=['status'],
+ s3_data_naming='table_unique'
+) }}
+
+select 'a' as user_id,
+ 'pi' as user_name,
+ 'active' as status
+union all
+select 'b' as user_id,
+ 'sh' as user_name,
+ 'disabled' as status
+```
+
+
+#### HA known issues
+
+- There could be a little downtime when swapping from a table with partitions to a table without (and the other way around). If higher performance is needed, consider bucketing instead of partitions.
+- By default, Glue "duplicates" the versions internally, so the last two versions of a table point to the same location.
+- It's recommended to set `versions_to_keep` >= 4, as this will avoid having the older location removed.
+
+### Update Glue Data Catalog
+
+You can persist your column- and model-level descriptions to the Glue Data Catalog as [glue table properties](https://docs.aws.amazon.com/glue/latest/dg/tables-described.html#table-properties) and [column parameters](https://docs.aws.amazon.com/glue/latest/webapi/API_Column.html). To enable this, set the `persist_docs` configuration values to `true` as shown in the following example. By default, documentation persistence is disabled, but you can enable it for specific resources or groups of resources as needed.
+
+
+For example:
+
+```yaml
+models:
+ - name: test_deduplicate
+ description: another value
+ config:
+ persist_docs:
+ relation: true
+ columns: true
+ meta:
+ test: value
+ columns:
+ - name: id
+ meta:
+ primary_key: true
+```
+
+Refer to [persist_docs](https://docs.getdbt.com/reference/resource-configs/persist_docs) for more details.
+
+## Snapshots
+
+The adapter supports snapshot materialization. It supports both the timestamp and check strategies. To create a snapshot, create a snapshot file in the `snapshots` directory. You'll need to create this directory if it doesn't already exist.
+
+### Timestamp strategy
+
+
+Refer to [Timestamp strategy](/docs/build/snapshots#timestamp-strategy-recommended) for details on how to use it.
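+
+A minimal sketch of a timestamp-strategy snapshot (the snapshot name, source model, and column names are assumptions; depending on your dbt version, you may also need to set `target_schema`):
+
+```sql
+{% snapshot orders_snapshot %}
+
+{{ config(
+    unique_key='id',
+    strategy='timestamp',
+    updated_at='updated_at'
+) }}
+
+select * from {{ ref('orders') }}
+
+{% endsnapshot %}
+```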
+
+
+### Check strategy
+
+Refer to [Check strategy](/docs/build/snapshots#check-strategy) for details on how to use it.
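+
+A minimal sketch of a check-strategy snapshot (names and checked columns are assumptions):
+
+```sql
+{% snapshot users_snapshot %}
+
+{{ config(
+    unique_key='id',
+    strategy='check',
+    check_cols=['status', 'name']
+) }}
+
+select * from {{ ref('users') }}
+
+{% endsnapshot %}
+```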
+
+### Hard deletes
+
+The materialization also supports invalidating hard deletes. For usage details, refer to [Hard deletes](/docs/build/snapshots#hard-deletes-opt-in).
+
+### Snapshots known issues
+
+- Incremental Iceberg models: when syncing all columns on schema change, columns used for partitioning can't be removed. From a dbt perspective, the only workaround is to fully refresh the incremental model.
+- Table, schema, and database names should be lowercase only.
+- To avoid potential conflicts, make sure [`dbt-athena-adapter`](https://github.com/Tomme/dbt-athena) is not installed in the target environment.
+- Snapshot does not support dropping columns from the source table. If you drop a column, make sure to drop the column from the snapshot as well. Another workaround is to NULL the column in the snapshot definition to preserve the history.
+
+## AWS Lake Formation integration
+
+The following describes how the adapter implements AWS Lake Formation tag management:
+
+- [Enable](#table-configuration) LF tags management with the `lf_tags_config` parameter. By default, it's disabled.
+- Once enabled, LF tags are updated on every dbt run.
+- First, all lf-tags for columns are removed to avoid inheritance issues.
+- Then, all redundant lf-tags are removed from tables and actual tags from table configs are applied.
+- Finally, lf-tags for columns are applied.
+
+It's important to understand the following points:
+
+- dbt doesn't manage `lf-tags` for databases
+- dbt doesn't manage Lake Formation permissions
+
+That's why it's important to take care of this yourself or use an automation tool such as Terraform or the AWS CDK. For more details, refer to:
+
+* [terraform aws_lakeformation_permissions](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lakeformation_permissions)
+* [terraform aws_lakeformation_resource_lf_tags](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lakeformation_resource_lf_tags)
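+
+As a hedged sketch of enabling LF tags on a single model (the tag keys, tag values, and column names are illustrative assumptions):
+
+```sql
+{{ config(
+    materialized='table',
+    lf_tags_config={
+        'enabled': true,
+        'tags': {
+            'data_classification': 'private'
+        },
+        'tags_columns': {
+            'pii': {
+                'true': ['email', 'phone']
+            }
+        }
+    }
+) }}
+
+select 'a@example.com' as email,
+       '555-0100' as phone
+```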
+
+## Python models
+
+The adapter supports Python models using [`spark`](https://docs.aws.amazon.com/athena/latest/ug/notebooks-spark.html).
+
+### Prerequisites
+
+- A Spark-enabled workgroup created in Athena.
+- Spark execution role granted access to Athena, Glue and S3.
+- The Spark workgroup is added to the `~/.dbt/profiles.yml` file and the profile to be used
+ is referenced in `dbt_project.yml`.
+
+### Spark-specific table configuration
+
+| Configuration | Default | Description |
+|---------------|---------|--------------|
+| `timeout` | 43200 | Timeout in seconds for each Python model execution. Defaults to 12 hours (43200 seconds). |
+| `spark_encryption` | False | When set to `true`, it encrypts data stored locally by Spark and in transit between Spark nodes. |
+| `spark_cross_account_catalog` | False | By default, queries in a Spark-enabled Athena workgroup can only be made against catalogs in the same AWS account. Set this parameter to `true` to enable querying catalogs in other AWS accounts. Use the syntax `external_catalog_id/database.table` to access a table in an external catalog (for example, `999999999999/mydatabase.cloudfront_logs`, where 999999999999 is the external catalog ID). |
+| `spark_requester_pays` | False | When set to `true`, if an Amazon S3 bucket is configured as `requester pays`, the user account running the query is charged for data access and data transfer fees associated with the query. |
+
+
+### Spark notes
+
+- A session is created for each unique engine configuration defined in the models that are part of the invocation.
+A session's idle timeout is set to 10 minutes. Within the timeout period, if a new calculation (Spark Python model) is ready for execution and the engine configuration matches, the process will reuse the same session.
+- The number of Python models running simultaneously depends on the `threads` setting. The number of sessions created for the entire run depends on the number of unique engine configurations and the availability of sessions to maintain thread concurrency.
+- For Iceberg tables, it's recommended to use the `table_properties` configuration to set the `format_version` to `2`. This helps maintain compatibility between Iceberg tables created by Trino and those created by Spark.
+
+### Example models
+
+
+
+
+
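+A simple table model built with pandas:
+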
+```python
+import pandas as pd
+
+
+def model(dbt, session):
+ dbt.config(materialized="table")
+
+ model_df = pd.DataFrame({"A": [1, 2, 3, 4]})
+
+ return model_df
+```
+
+
+
+
+
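+The same model expressed with a PySpark DataFrame:
+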
+```python
+def model(dbt, spark_session):
+ dbt.config(materialized="table")
+
+ data = [(1,), (2,), (3,), (4,)]
+
+ df = spark_session.createDataFrame(data, ["A"])
+
+ return df
+```
+
+
+
+
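+An incremental Python model that, on incremental runs, keeps only rows with a `run_date` at or after the latest one already in the target table:
+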
+```python
+def model(dbt, spark_session):
+ dbt.config(materialized="incremental")
+ df = dbt.ref("model")
+
+ if dbt.is_incremental:
+ max_from_this = (
+ f"select max(run_date) from {dbt.this.schema}.{dbt.this.identifier}"
+ )
+ df = df.filter(df.run_date >= spark_session.sql(max_from_this).collect()[0][0])
+
+ return df
+```
+
+
+
+
+
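+A model that passes Spark session settings such as `engine_config`, encryption, cross-account catalog access, requester pays, polling interval, and timeout:
+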
+```python
+def model(dbt, spark_session):
+ dbt.config(
+ materialized="table",
+ engine_config={
+ "CoordinatorDpuSize": 1,
+ "MaxConcurrentDpus": 3,
+ "DefaultExecutorDpuSize": 1
+ },
+ spark_encryption=True,
+ spark_cross_account_catalog=True,
+        spark_requester_pays=True,
+ polling_interval=15,
+ timeout=120,
+ )
+
+ data = [(1,), (2,), (3,), (4,)]
+
+ df = spark_session.createDataFrame(data, ["A"])
+
+ return df
+```
+
+
+
+
+
+Using imported external python files:
+
+```python
+def model(dbt, spark_session):
+ dbt.config(
+ materialized="incremental",
+ incremental_strategy="merge",
+ unique_key="num",
+ )
+ sc = spark_session.sparkContext
+ sc.addPyFile("s3://athena-dbt/test/file1.py")
+ sc.addPyFile("s3://athena-dbt/test/file2.py")
+
+ def func(iterator):
+ from file2 import transform
+
+ return [transform(i) for i in iterator]
+
+ from pyspark.sql.functions import udf
+ from pyspark.sql.functions import col
+
+ udf_with_import = udf(func)
+
+ data = [(1, "a"), (2, "b"), (3, "c")]
+ cols = ["num", "alpha"]
+ df = spark_session.createDataFrame(data, cols)
+
+ return df.withColumn("udf_test_col", udf_with_import(col("alpha")))
+```
+
+
+
+
+
+### Known issues in Python models
+
+- Python models can't [reference Athena SQL views](https://docs.aws.amazon.com/athena/latest/ug/notebooks-spark.html).
+- You can use third-party Python libraries; however, they must be [included in the pre-installed list][pre-installed list] or [imported manually][imported manually].
+- Python models can only reference or write to tables with names matching the regular expression: `^[0-9a-zA-Z_]+$`. Spark doesn't support dashes or special characters, even though Athena supports them.
+- Incremental models don't fully utilize Spark capabilities. They depend partially on existing SQL-based logic that runs on Trino.
+- Snapshot materializations are not supported.
+- Spark can only reference tables within the same catalog.
+- For tables created outside of the dbt tool, be sure to populate the location field, or dbt will throw an error when creating the table.
+
+
+[pre-installed list]: https://docs.aws.amazon.com/athena/latest/ug/notebooks-spark-preinstalled-python-libraries.html
+[imported manually]: https://docs.aws.amazon.com/athena/latest/ug/notebooks-import-files-libraries.html
+
+## Contracts
+
+The adapter partially supports contract definitions:
+
+- `data_type` is supported but needs to be adjusted for complex types. Types must be specified in full, including the inner types of complex types (for example, the element type of an `array`), even though they won't be checked in full. As dbt recommends, only the broader type (array, map, int, varchar) is compared. The complete definition is used only as a pre-flight check that the data types defined for Athena are valid. See the sketch after this list.
+- The adapter doesn't support constraints, since Athena has no concept of constraints.
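+
+A hedged sketch of a model contract in a properties file (the model and column names are assumptions):
+
+```yaml
+models:
+  - name: my_contracted_model
+    config:
+      contract:
+        enforced: true
+    columns:
+      - name: id
+        data_type: int
+      - name: tags
+        data_type: array<varchar>
+```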
+
diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md
index de1bb075015..5823fe7d9a4 100644
--- a/website/docs/reference/resource-configs/databricks-configs.md
+++ b/website/docs/reference/resource-configs/databricks-configs.md
@@ -410,31 +410,31 @@ To take advantage of this capability, you will need to add compute blocks to you
```yaml
-:
- target: # this is the default target
+profile-name:
+ target: target-name # this is the default target
outputs:
- :
+ target-name:
type: databricks
- catalog: [optional catalog name if you are using Unity Catalog]
- schema: [schema name] # Required
- host: [yourorg.databrickshost.com] # Required
+ catalog: optional catalog name if you are using Unity Catalog
+ schema: schema name # Required
+ host: yourorg.databrickshost.com # Required
### This path is used as the default compute
- http_path: [/sql/your/http/path] # Required
+ http_path: /sql/your/http/path # Required
### New compute section
compute:
### Name that you will use to refer to an alternate compute
Compute1:
- http_path: [‘/sql/your/http/path’] # Required of each alternate compute
+ http_path: '/sql/your/http/path' # Required of each alternate compute
### A third named compute, use whatever name you like
Compute2:
- http_path: [‘/some/other/path’] # Required of each alternate compute
+ http_path: '/some/other/path' # Required of each alternate compute
...
- : # additional targets
+ target-name: # additional targets
...
### For each target, you need to define the same compute,
### but you can specify different paths
@@ -442,11 +442,11 @@ To take advantage of this capability, you will need to add compute blocks to you
### Name that you will use to refer to an alternate compute
Compute1:
- http_path: [‘/sql/your/http/path’] # Required of each alternate compute
+ http_path: '/sql/your/http/path' # Required of each alternate compute
### A third named compute, use whatever name you like
Compute2:
- http_path: [‘/some/other/path’] # Required of each alternate compute
+ http_path: '/some/other/path' # Required of each alternate compute
...
```
diff --git a/website/docs/reference/resource-configs/pre-hook-post-hook.md b/website/docs/reference/resource-configs/pre-hook-post-hook.md
index bf4375c9490..e1e7d67f02e 100644
--- a/website/docs/reference/resource-configs/pre-hook-post-hook.md
+++ b/website/docs/reference/resource-configs/pre-hook-post-hook.md
@@ -45,6 +45,18 @@ select ...
```
+
+
+
+
+```yml
+models:
+ - name: []
+ config:
+ [pre_hook](/reference/resource-configs/pre-hook-post-hook): | []
+ [post_hook](/reference/resource-configs/pre-hook-post-hook): | []
+```
+
@@ -66,6 +78,18 @@ seeds:
+
+
+```yml
+seeds:
+ - name: []
+ config:
+ [pre_hook](/reference/resource-configs/pre-hook-post-hook): | []
+ [post_hook](/reference/resource-configs/pre-hook-post-hook): | []
+```
+
+
+
@@ -102,6 +126,18 @@ select ...