Add athena adapter preview #5858

Merged
merged 12 commits into from
Aug 2, 2024
@@ -8,6 +8,7 @@ pagination_prev: null
---
dbt Cloud can connect with a variety of data platform providers including:
- [AlloyDB](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb)
- [Amazon Athena (Preview)](/docs/cloud/connect-data-platform/connect-amazon-athena)
- [Amazon Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb)
- [Apache Spark](/docs/cloud/connect-data-platform/connect-apache-spark)
- [Azure Synapse Analytics](/docs/cloud/connect-data-platform/connect-azure-synapse-analytics)
@@ -0,0 +1,44 @@
---
title: "Connect Amazon Athena"
id: connect-amazon-athena
description: "Configure Amazon Athena connection."
sidebar_label: "Connect Athena"
---

# Connect Amazon Athena <Lifecycle status="beta" />

:::note beta

This is a beta feature with limited availability. A public preview will follow shortly for wider early access. For more information, see the [product lifecycle](/docs/dbt-versions/product-lifecycles#dbt-cloud) page.

:::

Your environment(s) must be on ["Keep on latest version"](/docs/dbt-versions/versionless-cloud) to use the Amazon Athena connection.

Connect dbt Cloud to Amazon's Athena interactive query service to build your dbt project. The following are the required and optional fields for configuring the Athena connection:

[Review comment, Contributor] Where should users add this field? Maybe a screenshot would be helpful?


| Field | Option | Description | Type | Required? | Example |
| ----------------------------- | ---------------- | ----------------------------------------------------------------------------------- | ------ | --------- | ------- |
| AWS region name | region_name | AWS region of your Athena instance | String | Required | eu-west-1 |
| Database (catalog) | database | Specify the database (Data catalog) to build models into (lowercase only) | String | Required | awsdatacatalog |
| AWS S3 staging directory | s3_staging_dir | S3 location to store Athena query results and metadata | String | Required | s3://bucket/dbt/ |
| Athena workgroup | work_group | Identifier of Athena workgroup | String | Optional | my-custom-workgroup |
| Athena Spark workgroup | spark_work_group | Identifier of Athena Spark workgroup for running Python models | String | Optional | my-spark-workgroup |
| AWS S3 data directory | s3_data_dir | Prefix for storing tables, if different from the connection's s3_staging_dir | String | Optional | s3://bucket2/dbt/ |
| AWS S3 data naming convention | s3_data_naming | How to generate table paths in s3_data_dir | String | Optional | schema_table_unique |
| AWS S3 temp tables prefix | s3_tmp_table_dir | Prefix for storing temporary tables, if different from the connection's s3_data_dir | String | Optional | s3://bucket3/dbt/ |
| Poll interval | poll_interval | Interval in seconds to use for polling the status of query results in Athena | Integer| Optional | 5 |
| Query retries | num_retries | Number of times to retry a failing query | Integer| Optional | 3 |
| Boto3 retries | num_boto3_retries| Number of times to retry boto3 requests (e.g. deleting S3 files for materialized tables)| Integer | Optional | 5 |
| Iceberg retries | num_iceberg_retries| Number of times to retry iceberg commit queries to fix ICEBERG_COMMIT_ERROR | Integer | Optional | 0 |
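
As a rough illustration of the constraints in the table above, the following Python sketch checks a dictionary of connection settings before you type them into the connection form. The helper `check_athena_connection` is hypothetical and is not part of dbt Cloud or the dbt-athena adapter. The rules that S3 locations start with `s3://` and that integer options are non-negative are assumptions inferred from the examples in the table, not documented validation behavior.

```python
from __future__ import annotations

# Hypothetical helper for illustration only; not a dbt Cloud or dbt-athena API.
REQUIRED_FIELDS = ("region_name", "database", "s3_staging_dir")
S3_FIELDS = ("s3_staging_dir", "s3_data_dir", "s3_tmp_table_dir")
INTEGER_FIELDS = (
    "poll_interval",
    "num_retries",
    "num_boto3_retries",
    "num_iceberg_retries",
)


def check_athena_connection(settings: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Required fields from the table must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not settings.get(name):
            problems.append(f"{name} is required")

    # The table states that the database (Data catalog) is lowercase only.
    database = settings.get("database")
    if database and database != database.lower():
        problems.append("database must be lowercase")

    # Assumption: S3 locations are given as s3:// URIs, as in the examples.
    for name in S3_FIELDS:
        value = settings.get(name)
        if value and not value.startswith("s3://"):
            problems.append(f"{name} must start with s3://")

    # Assumption: integer options are non-negative whole numbers.
    for name in INTEGER_FIELDS:
        value = settings.get(name)
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"{name} must be a non-negative integer")

    return problems


example = {
    "region_name": "eu-west-1",
    "database": "awsdatacatalog",
    "s3_staging_dir": "s3://bucket/dbt/",
    "poll_interval": 5,
}
print(check_athena_connection(example))
```

With the example values from the table, the function reports no problems. A mixed-case catalog name or a staging directory without the `s3://` prefix would each add one entry to the returned list.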

### Development credentials

Enter your _development_ (not deployment) credentials with the following fields:

| Field | Option | Description | Type | Required? | Example |
| --------------------- | --------------------- | -------------------------------------------------------------------------- | ------ | -------- | -------- |
| AWS Access Key ID | aws_access_key_id | Access key ID of the user performing requests | String | Required | AKIAIOSFODNN7EXAMPLE |
| AWS Secret Access Key | aws_secret_access_key | Secret access key of the user performing requests | String | Required | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| Schema | schema | Specify the schema (Athena database) to build models into (lowercase only) | String | Required | dbt |
| Threads | threads | Number of threads dbt can use to run models in parallel | Integer| Optional | 3 |
4 changes: 4 additions & 0 deletions website/docs/docs/dbt-versions/release-notes.md
@@ -18,6 +18,10 @@ Release notes are grouped by month for both multi-tenant and virtual private clo

[^*] The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.

## August 2024

- **New:** The [Amazon Athena connection](/docs/cloud/connect-data-platform/connect-amazon-athena) for dbt Cloud is now available in [Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud).

## July 2024
- **Behavior change:** The dbt Cloud IDE automatically adds `--limit 100` to preview queries to avoid slow and expensive queries during development. Recently, dbt Core changed how the `limit` is applied to ensure that `order by` clauses are consistently respected. Because of this, queries that already contain a limit clause might now cause errors in the IDE previews. To address this, dbt Labs plans to provide an option soon to disable the limit from being applied. Until then, dbt Labs recommends removing the (duplicate) limit clause from your queries during previews to avoid these IDE errors.
