From a4c9f11095c1291224e41bc9177db4cca1f01300 Mon Sep 17 00:00:00 2001
From: Jean Cochrane
Date: Tue, 6 Aug 2024 15:27:52 -0500
Subject: [PATCH 1/8] Update docs to clarify the way aliases are used in CTEs (#5795)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## What are you changing in this pull request and why?

This PR updates the docs to reflect the bugfix made in https://github.com/dbt-labs/dbt-core/pull/10290 and https://github.com/dbt-labs/dbt-adapters/pull/236 to bring the identifiers used for CTEs in line with the identifiers used for tables and views.

⚠️ Note that these changes should **not** be deployed until those two PRs have been merged and released, since these docs will not accurately reflect the behavior of the packages until they have been patched. I've included this prerequisite in the checklist below.

## Checklist
- [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines.
- [x] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content).
- [x] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
- [ ] Merge and release https://github.com/dbt-labs/dbt-core/pull/10290 and https://github.com/dbt-labs/dbt-adapters/pull/236

---------

Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com>
---
 website/docs/docs/build/custom-aliases.md | 18 +++++++++++-------
 website/docs/docs/build/materializations.md | 3 ++-
 .../docs/reference/resource-configs/alias.md | 2 ++
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/website/docs/docs/build/custom-aliases.md b/website/docs/docs/build/custom-aliases.md
index ee56480dc5c..4f22de63e3f 100644
--- a/website/docs/docs/build/custom-aliases.md
+++ b/website/docs/docs/build/custom-aliases.md
@@ -6,18 +6,22 @@ id: "custom-aliases"

 ## Overview

-When dbt runs a model, it will generally create a relation (either a `table` or a `view`) in the database. By default, dbt uses the filename of the model as the identifier for this relation in the database. This identifier can optionally be overridden using the [`alias`](/reference/resource-configs/alias) model configuration.
+When dbt runs a model, it will generally create a relation (either a table or a view) in the database, except in the case of an [ephemeral model](/docs/build/materializations), when it will create a CTE for use in another model. By default, dbt uses the model's filename as the identifier for the relation or CTE it creates. This identifier can be overridden using the [`alias`](/reference/resource-configs/alias) model configuration.

 ### Why alias model names?
 The names of schemas and tables are effectively the "user interface" of your data warehouse. Well-named schemas and tables can help provide clarity and direction for consumers of this data. In combination with [custom schemas](/docs/build/custom-schemas), model aliasing is a powerful mechanism for designing your warehouse.

-### Usage
-The `alias` config can be used to change the name of a model's identifier in the database. The following shows examples of database identifiers for models both with, and without, a supplied `alias`.
+The file naming scheme that you use to organize your models may also interfere with your data platform's requirements for identifiers. For example, you might wish to namespace your files using a period (`.`), but your data platform's SQL dialect may interpret periods to indicate a separation between schema names and table names in identifiers, or it may forbid periods from being used at all in CTE identifiers. In cases like these, model aliasing can allow you to retain flexibility in the way you name your model files without violating your data platform's identifier requirements.

-| Model | Config | Database Identifier |
-| ----- | ------ | ------------------- |
-| ga_sessions.sql | <None> | "analytics"."ga_sessions" |
-| ga_sessions.sql | {{ config(alias='sessions') }} | "analytics"."sessions" |
+### Usage
+The `alias` config can be used to change the name of a model's identifier in the database. The following table shows examples of database identifiers for models both with and without a supplied `alias`, and with different materializations.
+
+| Model | Config | Relation Type | Database Identifier |
+| ----- | ------ | --------------| ------------------- |
+| ga_sessions.sql | {{ config(materialized='view') }} | view | "analytics"."ga_sessions" |
+| ga_sessions.sql | {{ config(materialized='view', alias='sessions') }} | view | "analytics"."sessions" |
+| ga_sessions.sql | {{ config(materialized='ephemeral') }} | CTE | "\__dbt\__cte\__ga_sessions" |
+| ga_sessions.sql | {{ config(materialized='ephemeral', alias='sessions') }} | CTE | "\__dbt\__cte\__sessions" |

 To configure an alias for a model, supply a value for the model's `alias` configuration parameter. For example:

diff --git a/website/docs/docs/build/materializations.md b/website/docs/docs/build/materializations.md
index eb150a2b20c..5deb1e7ce92 100644
--- a/website/docs/docs/build/materializations.md
+++ b/website/docs/docs/build/materializations.md
@@ -94,7 +94,8 @@ When using the `table` materialization, your model is rebuilt as a expression.
+`ephemeral` models are not directly built into the database. Instead, dbt will interpolate the code from an ephemeral model into its dependent models using a common table expression (CTE). You can control the identifier for this CTE using a [model alias](/docs/build/custom-aliases), but dbt will always prefix the model identifier with `__dbt__cte__`.
+
 * **Pros:**
     * You can still write reusable logic
   - Ephemeral models can help keep your data warehouse clean by reducing clutter (also consider splitting your models across multiple schemas by [using custom schemas](/docs/build/custom-schemas)).
diff --git a/website/docs/reference/resource-configs/alias.md b/website/docs/reference/resource-configs/alias.md
index 6cb14371dfa..3f36bbd0d8f 100644
--- a/website/docs/reference/resource-configs/alias.md
+++ b/website/docs/reference/resource-configs/alias.md
@@ -116,5 +116,7 @@ The standard behavior of dbt is:
 * If a custom alias is _not_ specified, the identifier of the relation is the resource name (i.e. the filename).
 * If a custom alias is specified, the identifier of the relation is the `{{ alias }}` value.

+**Note** With an [ephemeral model](/docs/build/materializations), dbt will always apply the prefix `__dbt__cte__` to the identifier. This means that if an alias is set on an ephemeral model, then its CTE identifier will be `__dbt__cte__{{ alias }}`, but if no alias is set then its identifier will be `__dbt__cte__{{ filename }}`.
+
 To learn more about changing the way that dbt generates a relation's `identifier`, read [Using Aliases](/docs/build/custom-aliases).
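
Reviewer's illustration for PATCH 1/8: the identifier rule this patch documents (an alias overrides the filename, and ephemeral models always get the `__dbt__cte__` prefix) can be sketched as a small function. This is only a sketch under stated assumptions, not dbt's actual implementation: the name `resolve_identifier` and its parameters are hypothetical, and the schema qualification shown in the docs table (for example `"analytics"."sessions"`) is deliberately left out.

```python
from typing import Optional

# Prefix that, per the docs change in this patch, is applied to ephemeral model CTEs.
CTE_PREFIX = "__dbt__cte__"


def resolve_identifier(filename: str, materialized: str, alias: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the documented rule; this is not a dbt API."""
    # The default model name is the filename without its .sql extension.
    name = filename[:-4] if filename.endswith(".sql") else filename
    # An alias, when supplied, overrides the filename-derived name.
    identifier = alias if alias is not None else name
    # Ephemeral models are interpolated as CTEs and always receive the prefix.
    if materialized == "ephemeral":
        return CTE_PREFIX + identifier
    return identifier


# Walk the four cases from the table in custom-aliases.md.
for materialized, alias in [
    ("view", None),
    ("view", "sessions"),
    ("ephemeral", None),
    ("ephemeral", "sessions"),
]:
    print(materialized, alias, resolve_identifier("ga_sessions.sql", materialized, alias))
```

The four cases in the loop correspond to the four rows of the updated table, so the sketch can be compared against the documented identifiers row by row.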
From bcc9b8bf45babd28e66fa38e3d3384cfb32e2d11 Mon Sep 17 00:00:00 2001 From: ialdg <39755524+ialdg@users.noreply.github.com> Date: Tue, 6 Aug 2024 22:42:21 +0200 Subject: [PATCH 2/8] Update schema.md (#5383) Hi. This modification proposal is intended to correct a possible mistake, in regard to how the full qualified name of a seed would be built. I think there's a mistake since the present name uses the name of the subfolder as the seed model name. Regards. IL. ## What are you changing in this pull request and why? ## Checklist - [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. - [ ] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content). - [ ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch." Adding or removing pages (delete if not applicable): - [ ] Add/remove page in `website/sidebars.js` - [ ] Provide a unique filename for new pages - [ ] Add an entry for deleted pages in `website/vercel.json` - [ ] Run link testing locally with `npm run build` to update the links that point to deleted pages Co-authored-by: Leona B. 
Campbell <3880403+runleonarun@users.noreply.github.com> Co-authored-by: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/reference/resource-configs/schema.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/website/docs/reference/resource-configs/schema.md b/website/docs/reference/resource-configs/schema.md index 5a1a61d3943..57a357767cb 100644 --- a/website/docs/reference/resource-configs/schema.md +++ b/website/docs/reference/resource-configs/schema.md @@ -41,7 +41,8 @@ seeds: +schema: mappings ``` -This would result in the generated relation being located in the `mappings` schema, so the full relation name would be `analytics.target_schema_mappings.product_mappings`. +This would result in the generated relation being located in the `mappings` schema, so the full relation name would be `analytics.mappings.seed_name`. + From 08d74a3176ed350688af7e34eb83ac2be8b4ed54 Mon Sep 17 00:00:00 2001 From: Bart Schuijt Date: Tue, 6 Aug 2024 23:21:13 +0200 Subject: [PATCH 3/8] Update warnings.md (#5693) ## What are you changing in this pull request and why? Small clarification that values should be passed on as arrays. Previously (dbt 1.7.x) this worked but from 1.8.x. this no longer seems to be the case. ## Checklist - [ ] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. - [ ] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content). 
- [ ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch." Adding or removing pages (delete if not applicable): - [ ] Add/remove page in `website/sidebars.js` - [ ] Provide a unique filename for new pages - [ ] Add an entry for deleted pages in `website/vercel.json` - [ ] Run link testing locally with `npm run build` to update the links that point to deleted pages --------- Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/reference/global-configs/warnings.md | 1 + 1 file changed, 1 insertion(+) diff --git a/website/docs/reference/global-configs/warnings.md b/website/docs/reference/global-configs/warnings.md index 0cb4add5f0d..97eb270338e 100644 --- a/website/docs/reference/global-configs/warnings.md +++ b/website/docs/reference/global-configs/warnings.md @@ -82,6 +82,7 @@ DBT_WARN_ERROR_OPTIONS='{"include": ["NoNodesForSelectionCriteria"]}' dbt run ... ``` +Values for `error`, `warn`, and/or `silence` should be passed on as arrays. For example, `dbt --warn-error-options '{"error": "all", "warn": ["NoNodesForSelectionCriteria"]}' run` not `dbt --warn-error-options '{"error": "all", "warn": "NoNodesForSelectionCriteria"}' run`. From 57ac634122039e2d1602cf7e8261c915e9a8ad67 Mon Sep 17 00:00:00 2001 From: bethanyhipple-dbtlabs <108838013+bethanyhipple-dbtlabs@users.noreply.github.com> Date: Tue, 6 Aug 2024 14:30:31 -0700 Subject: [PATCH 4/8] Change where clause in unique_key example (#5897) ## What are you changing in this pull request and why? The max statement in the where clause for the unique_key example referred to event_time, but event_time isn't in the select statement. So this should be updated to max(date_day). ## Checklist - [X ] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. 
- [ X] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content). - [X ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch." Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/docs/build/incremental-models.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/build/incremental-models.md b/website/docs/docs/build/incremental-models.md index 21cd656484a..df95504ceab 100644 --- a/website/docs/docs/build/incremental-models.md +++ b/website/docs/docs/build/incremental-models.md @@ -142,7 +142,7 @@ from {{ ref('app_data_events') }} -- this filter will only be applied on an incremental run -- (uses >= to include records arriving later on the same day as the last run of this model) - where date_day >= (select coalesce(max(event_time), '1900-01-01') from {{ this }}) + where date_day >= (select coalesce(max(date_day), '1900-01-01') from {{ this }}) {% endif %} From 19607529038c8a41f03053c9c5f351f432678109 Mon Sep 17 00:00:00 2001 From: Tania <92768464+Tonayya@users.noreply.github.com> Date: Wed, 7 Aug 2024 19:20:47 +1000 Subject: [PATCH 5/8] Update ci-jobs.md (#5899) ## What are you changing in this pull request and why? One of our customers shared this example and wanted to add this to the troubleshooting segment for any other customers that may experience the same issue. 
[ZD ticket](https://dbtcloud.zendesk.com/agent/tickets/71658) ## Checklist - [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. --------- Co-authored-by: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> --- website/docs/docs/deploy/ci-jobs.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/ci-jobs.md b/website/docs/docs/deploy/ci-jobs.md index a96311a850f..bd4b117d586 100644 --- a/website/docs/docs/deploy/ci-jobs.md +++ b/website/docs/docs/deploy/ci-jobs.md @@ -216,7 +216,11 @@ If you're on a Virtual Private dbt Enterprise plan using security features like When you start a CI job, the pull request status should show as `pending` while it waits for an update from dbt. Once the CI job finishes, dbt sends the status to Azure DevOps (ADO), and the status will change to either `succeeded` or `failed`. -If the status doesn't get updated after the job runs, check if there are any git branch policies in place that's blocking ADO from receiving these updates. You can find relevant information here: +If the status doesn't get updated after the job runs, check if there are any git branch policies in place blocking ADO from receiving these updates. + +One potential issue is the **Reset conditions** under **Status checks** in the ADO repository branch policy. If you enable the **Reset status whenever there are new changes** checkbox (under **Reset conditions**), it can prevent dbt from updating ADO about your CI job run status. 
+You can find relevant information here: +- [Azure DevOps Services Status checks](https://learn.microsoft.com/en-us/azure/devops/repos/git/branch-policies?view=azure-devops&tabs=browser#status-checks) - [Azure DevOps Services Pull Request Stuck Waiting on Status Update](https://support.hashicorp.com/hc/en-us/articles/18670331556627-Azure-DevOps-Services-Pull-Request-Stuck-Waiting-on-Status-Update-from-Terraform-Cloud-Enterprise-Run) - [Pull request status](https://learn.microsoft.com/en-us/azure/devops/repos/git/pull-request-status?view=azure-devops#pull-request-status) From 966ad09c1854322b2f1f6dd7013c17f1bd4fa8dd Mon Sep 17 00:00:00 2001 From: Petro Tiurin <93913847+ptiurin@users.noreply.github.com> Date: Wed, 7 Aug 2024 11:57:29 +0100 Subject: [PATCH 6/8] Update Firebolt features and connection parameters (#5868) ## What are you changing in this pull request and why? Firebolt has rolled out a new authentication method with service account credentials, all new Firebolt DBT users would use it from now on. Existing dbt-firebolt connector handles this behind the scenes allowing users that haven't migrated yet to use their old credentials while also allowing new customers to use the recommended way of authentication. This change in documentation reflects the new auth to avoid confusion for the new users. Also updating some of the outdated information on Firebolt features supported, like removal of Join indexes and engine types and updating doc links to the new website. ## Checklist - [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. 
- [x] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content). Adding or removing pages (delete if not applicable): - [x] Run link testing locally with `npm run build` to update the links that point to deleted pages --- .../connect-data-platform/firebolt-setup.md | 26 +++---- .../resource-configs/firebolt-configs.md | 68 +++++-------------- 2 files changed, 29 insertions(+), 65 deletions(-) diff --git a/website/docs/docs/core/connect-data-platform/firebolt-setup.md b/website/docs/docs/core/connect-data-platform/firebolt-setup.md index 8fb91dea299..b1695c75b37 100644 --- a/website/docs/docs/core/connect-data-platform/firebolt-setup.md +++ b/website/docs/docs/core/connect-data-platform/firebolt-setup.md @@ -40,16 +40,15 @@ To connect to Firebolt from dbt, you'll need to add a [profile](https://docs.get outputs: : type: firebolt - user: "" - password: "" + client_id: "" + client_secret: "" database: "" engine_name: "" + account_name: "" schema: threads: 1 #optional fields - jar_path: host: "" - account_name: "" ``` @@ -57,30 +56,27 @@ To connect to Firebolt from dbt, you'll need to add a [profile](https://docs.get #### Description of Firebolt Profile Fields -To specify values as environment variables, use the format `{{ env_var('' }}`. For example, `{{ env_var('DATABASE_NAME' }}`. +To specify values as environment variables, use the format `{{ env_var('') }}`. For example, `{{ env_var('DATABASE_NAME') }}`. 
| Field | Description | |--------------------------|--------------------------------------------------------------------------------------------------------| | `type` | This must be included either in `profiles.yml` or in the `dbt_project.yml` file. Must be set to `firebolt`. | -| `user` | Required. A Firebolt username with adequate permissions to access the specified `engine_name`. | -| `password` | Required. The password associated with the specified `user`. | +| `client_id` | Required. Your [service account](https://docs.firebolt.io/godocs/Guides/managing-your-organization/service-accounts.html) ID. | +| `client_secret` | Required. The secret associated with the specified `client_id`. | | `database` | Required. The name of the Firebolt database to connect to. | | `engine_name` | Required in version 0.21.10 and later. Optional in earlier versions. The name (not the URL) of the Firebolt engine to use in the specified `database`. This must be a general purpose read-write engine and the engine must be running. If omitted in earlier versions, the default engine for the specified `database` is used. | +| `account_name` | Required. Specifies the account name under which the specified `database` exists. | | `schema` | Recommended. A string to add as a prefix to the names of generated tables when using the [custom schemas workaround](https://docs.getdbt.com/reference/warehouse-profiles/firebolt-profile#supporting-concurrent-development). | -| `threads` | Required. Must be set to `1`. Multi-threading is not currently supported. | -| `jar_path` | Required only with versions earlier than 0.21.0. Ignored in 0.21.0 and later. The path to your JDBC driver on your local drive. | +| `threads` | Required. Set to a higher number to improve performance. | | `host` | Optional. The host name of the connection. For all customers it is `api.app.firebolt.io`, which will be used if omitted. | -| `account_name` | Required if more than one account is associated with the specified `user1`. 
Specifies the account name (not the account ID) under which the specified `database` exists. If omitted, the default account is assumed. | - + #### Troubleshooting Connections If you encounter issues connecting to Firebolt from dbt, make sure the following criteria are met: -- The engine must be a general-purpose read-write engine, not an analytics engine. -- You must have adequate permissions to access the engine. +- You must have adequate permissions to access the engine and the database. +- Your service account must be attached to a user. - The engine must be running. -- If you're not using the default engine for the database, you must specify an engine name. -- If there is more than one account associated with your credentials, you must specify an account. ## Supporting Concurrent Development diff --git a/website/docs/reference/resource-configs/firebolt-configs.md b/website/docs/reference/resource-configs/firebolt-configs.md index 59adee715ba..58fc0e2a319 100644 --- a/website/docs/reference/resource-configs/firebolt-configs.md +++ b/website/docs/reference/resource-configs/firebolt-configs.md @@ -58,7 +58,7 @@ models: table_type: fact primary_index: [ , ... ] indexes: - - type: aggregating | join + - type: aggregating key_column: [ , ... ] aggregation: [ , ... ] ... @@ -96,10 +96,10 @@ models: | Configuration | Description | |-------------------|-------------------------------------------------------------------------------------------| | `materialized` | How the model will be materialized into Firebolt. Must be `table` to create a fact table. | -| `table_type` | Whether the materialized table will be a [fact or dimension](https://docs.firebolt.io/working-with-tables.html#fact-and-dimension-tables) table. | +| `table_type` | Whether the materialized table will be a [fact or dimension](https://docs.firebolt.io/godocs/Overview/working-with-tables/working-with-tables.html#fact-and-dimension-tables) table. 
| | `primary_index` | Sets the primary index for the fact table using the inputted list of column names from the model. Required for fact tables. | | `indexes` | A list of aggregating indexes to create on the fact table. | -| `type` | Specifies whether the index is an aggregating index or join index. Join indexes only apply to dimension tables, so for fact tables set to `aggregating`. | +| `type` | Specifies that the index is an [aggregating index](https://docs.firebolt.io/godocs/Guides/working-with-indexes/using-aggregating-indexes.html). Should be set to `aggregating`. | | `key_column` | Sets the grouping of the aggregating index using the inputted list of column names from the model. | | `aggregation` | Sets the aggregations on the aggregating index using the inputted list of SQL agg expressions. | @@ -144,11 +144,7 @@ models: : +materialized: table +table_type: dimension - +indexes: - - type: join - join_column: - dimension_column: [ , ... ] - ... + ... ``` @@ -163,11 +159,7 @@ models: config: materialized: table table_type: dimension - indexes: - - type: join - join_column: - dimension_column: [ , ... ], - ... + ... ``` @@ -180,14 +172,7 @@ models: {{ config( materialized = "table", table_type = "dimension", - indexes = [ - { - type = "join", - join_column = "", - dimension_column: [ "", ... ] - }, - ... - ], + ... ) }} ``` @@ -195,39 +180,19 @@ models: +Dimension tables do not support aggregation indexes. #### Dimension Table Configurations | Configuration | Description | |--------------------|-------------------------------------------------------------------------------------------| | `materialized` | How the model will be materialized into Firebolt. Must be `table` to create a dimension table. | -| `table_type` | Whether the materialized table will be a [fact or dimension](https://docs.firebolt.io/working-with-tables.html#fact-and-dimension-tables) table. | -| `indexes` | A list of join indexes to create on the dimension table. 
| -| `type` | Specifies whether the index is an aggregating index or join index. Aggregating indexes only apply to fact tables, so for dimension tables set to `join`. | -| `join_column` | Sets the join key of the join index using the inputted column name from the model. | -| `dimension_column` | Sets the columns to be loaded into memory on the join index using the inputted list of column names from the mode. | +| `table_type` | Whether the materialized table will be a [fact or dimension](https://docs.firebolt.io/godocs/Overview/working-with-tables/working-with-tables.html#fact-and-dimension-tables) table. | -#### Example of a Dimension Table With a Join Index +## How Aggregating Indexes Are Named -``` -{{ config( - materialized = "table", - table_type = "dimension", - indexes = [ - { - type: "join", - join_column: "order_id", - dimension_column: ["customer_id", "status"] - } - ] -) }} -``` - - -## How Aggregating Indexes and Join Indexes Are Named - -In dbt-firebolt, you do not provide names for aggregating indexes and join indexes; they are named programmatically. dbt will generate index names using the following convention: +In dbt-firebolt, you do not provide names for aggregating indexes; they are named programmatically. dbt will generate index names using the following convention: ``` _____ @@ -240,7 +205,7 @@ For example, a join index could be named `my_users__id__join_1633504263` and an `dbt-firebolt` supports dbt's [external tables feature](https://docs.getdbt.com/reference/resource-properties/external), which allows dbt to manage the table ingestion process from S3 into Firebolt. This is an optional feature but can be highly convenient depending on your use case. -More information on using external tables including properly configuring IAM can be found in the Firebolt [documentation](https://docs.firebolt.io/sql-reference/commands/ddl-commands#create-external-table). 
+More information on using external tables including properly configuring IAM can be found in the Firebolt [documentation](https://docs.firebolt.io/godocs/Guides/loading-data/working-with-external-tables.html). #### Installation of External Tables Package @@ -288,8 +253,8 @@ sources: object_pattern: '' type: '' credentials: - internal_role_arn: arn:aws:iam::id:/ - external_role_id: + aws_key_id: + aws_secret_key: object_pattern: '' compression: '' partitions: @@ -301,6 +266,9 @@ sources: data_type: ``` +`aws_key_id` and `aws_secret_key` are the credentials that allow Firebolt access to your S3 bucket. Learn +how to set them up by following this [guide](https://docs.firebolt.io/godocs/Guides/loading-data/creating-access-keys-aws.html). If your bucket is public, these parameters are not necessary. + #### Running External tables The `stage_external_sources` macro is inherited from the [dbt-external-tables package](https://github.com/dbt-labs/dbt-external-tables#syntax) and is the primary point of entry when using this package. It has two operational modes: standard and "full refresh." @@ -311,11 +279,11 @@ $ dbt run-operation stage_external_sources # iterate through all source nodes, create or replace (no refresh command is required as data is fetched live from remote) $ dbt run-operation stage_external_sources --vars "ext_full_refresh: true" -``` +``` ## Incremental models -The [`incremental_strategy` configuration](/docs/build/incremental-strategy) controls how dbt builds incremental models. Firebolt currently supports the `append` configuration. You can specify `incremental_strategy` in `dbt_project.yml` or within a model file's `config()` block. The `append` configuration is the default. Specifying this configuration is optional. +The [`incremental_strategy` configuration](/docs/build/incremental-strategy) controls how dbt builds incremental models. Firebolt currently supports the `append`, `insert_overwrite`, and `delete+insert` configurations. 
You can specify `incremental_strategy` in `dbt_project.yml` or within a model file's `config()` block. The `append` configuration is the default. Specifying this configuration is optional. The `append` strategy performs an `INSERT INTO` statement with all the new data based on the model definition. This strategy doesn't update or delete existing rows, so if you do not filter the data to the most recent records only, it is likely that duplicate records will be inserted. From e4872730cd292b5da168ba58f332b5d5c46026a9 Mon Sep 17 00:00:00 2001 From: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Date: Wed, 7 Aug 2024 12:15:47 +0100 Subject: [PATCH 7/8] Update firebolt-configs.md change to sentence case --- .../resource-configs/firebolt-configs.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/website/docs/reference/resource-configs/firebolt-configs.md b/website/docs/reference/resource-configs/firebolt-configs.md index 58fc0e2a319..e37a711bac9 100644 --- a/website/docs/reference/resource-configs/firebolt-configs.md +++ b/website/docs/reference/resource-configs/firebolt-configs.md @@ -14,7 +14,7 @@ seeds: ``` -## Model Configuration for Fact Tables +## Model configuration for fact tables A dbt model can be created as a Firebolt fact and configured using the following syntax: @@ -91,7 +91,7 @@ models: -#### Fact Table Configurations +#### Fact table configurations | Configuration | Description | |-------------------|-------------------------------------------------------------------------------------------| @@ -104,7 +104,7 @@ models: | `aggregation` | Sets the aggregations on the aggregating index using the inputted list of SQL agg expressions. 
| -#### Example of a Fact Table With an Aggregating Index +#### Example of a fact table with an aggregating index ``` {{ config( @@ -122,7 +122,7 @@ models: ``` -## Model Configuration for Dimension Tables +## Model configuration for dimension tables A dbt model can be materialized as a Firebolt dimension table and configured using the following syntax: @@ -182,7 +182,7 @@ models: Dimension tables do not support aggregation indexes. -#### Dimension Table Configurations +#### Dimension table configurations | Configuration | Description | |--------------------|-------------------------------------------------------------------------------------------| @@ -201,7 +201,7 @@ In dbt-firebolt, you do not provide names for aggregating indexes; they are name For example, a join index could be named `my_users__id__join_1633504263` and an aggregating index could be named `my_orders__order_date__aggregating_1633504263`. -## Managing Ingestion via External Tables +## Managing ingestion via external tables `dbt-firebolt` supports dbt's [external tables feature](https://docs.getdbt.com/reference/resource-properties/external), which allows dbt to manage the table ingestion process from S3 into Firebolt. This is an optional feature but can be highly convenient depending on your use case. @@ -231,7 +231,7 @@ To install and use `dbt-external-tables` with Firebolt, you must: 3. Pull in the `packages.yml` dependencies by calling `dbt deps`. -#### Using External Tables +#### Using external tables To use external tables, you must define a table as `external` in your `dbt_project.yml` file. Every external table must contain the fields `url`, `type`, and `object_pattern`. Note that the Firebolt external table specification requires fewer fields than what is specified in the dbt documentation. @@ -269,7 +269,7 @@ sources: `aws_key_id` and `aws_secret_key` are the credentials that allow Firebolt access to your S3 bucket. 
Learn how to set them up by following this [guide](https://docs.firebolt.io/godocs/Guides/loading-data/creating-access-keys-aws.html). If your bucket is public, these parameters are not necessary. -#### Running External tables +#### Running external tables The `stage_external_sources` macro is inherited from the [dbt-external-tables package](https://github.com/dbt-labs/dbt-external-tables#syntax) and is the primary point of entry when using this package. It has two operational modes: standard and "full refresh." From fc185c48a2a17870660dd5b1158c753f69db01d3 Mon Sep 17 00:00:00 2001 From: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Date: Wed, 7 Aug 2024 12:17:09 +0100 Subject: [PATCH 8/8] Update firebolt-configs.md change to sentence case --- website/docs/reference/resource-configs/firebolt-configs.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/website/docs/reference/resource-configs/firebolt-configs.md b/website/docs/reference/resource-configs/firebolt-configs.md index e37a711bac9..394823e33de 100644 --- a/website/docs/reference/resource-configs/firebolt-configs.md +++ b/website/docs/reference/resource-configs/firebolt-configs.md @@ -190,7 +190,7 @@ Dimension tables do not support aggregation indexes. | `table_type` | Whether the materialized table will be a [fact or dimension](https://docs.firebolt.io/godocs/Overview/working-with-tables/working-with-tables.html#fact-and-dimension-tables) table. | -## How Aggregating Indexes Are Named +## How aggregating indexes are named In dbt-firebolt, you do not provide names for aggregating indexes; they are named programmatically. 
dbt will generate index names using the following convention: @@ -208,7 +208,7 @@ For example, a join index could be named `my_users__id__join_1633504263` and an More information on using external tables including properly configuring IAM can be found in the Firebolt [documentation](https://docs.firebolt.io/godocs/Guides/loading-data/working-with-external-tables.html). -#### Installation of External Tables Package +#### Installation of external tables package To install and use `dbt-external-tables` with Firebolt, you must: @@ -238,7 +238,7 @@ To use external tables, you must define a table as `external` in your `dbt_proje In addition to specifying the columns, an external table may specify partitions. Partitions are not columns and they cannot have the same name as columns. To avoid YAML parsing errors, remember to encase string literals (such as the `url` and `object_pattern` values) in single quotation marks. -#### dbt_project.yml Syntax For an External Table +#### dbt_project.yml syntax for an external table ```yml sources: