Update Databricks quickstart (#4564)
## What are you changing in this pull request and why?

- Specify that the quickstart assumes using Partner Connect
- Include steps to connect using Partner Connect inline
- Remove unnecessary step (set up managed repository) that's only
required if connecting manually, not using Partner Connect
- Clarify required catalog/schema privileges
- Document Unity Catalog vs. legacy behavior/privileges

## Checklist
<!--
Uncomment if you're publishing docs for a prerelease version of dbt
(delete if not applicable):
- [ ] Add versioning components, as described in [Versioning
Docs](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-entire-pages)
- [ ] Add a note to the prerelease version [Migration
Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
-->
- [ ] Review the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [ ] For [docs
versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning),
review how to [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content).
- [ ] Add a checklist item for anything that needs to happen before this
PR is merged, such as "needs technical review" or "change base branch."

Adding new pages (delete if not applicable):
- [ ] Add page to `website/sidebars.js`
- [ ] Provide a unique filename for the new page

Removing or renaming existing pages (delete if not applicable):
- [ ] Remove page from `website/sidebars.js`
- [ ] Add an entry to `website/static/_redirects`
- [ ] Run link testing locally with `npm run build` to update the links
that point to the deleted page

---------

Co-authored-by: mirnawong1 <[email protected]>
Co-authored-by: Matt Shaver <[email protected]>
3 people authored Jul 30, 2024
1 parent db7cbb8 commit aa19231
Showing 1 changed file with 51 additions and 4 deletions.
55 changes: 51 additions & 4 deletions website/docs/guides/databricks-qs.md
@@ -169,16 +169,63 @@ If you get a session error and don’t get redirected to this page, you can go b

There are two ways to connect dbt Cloud to Databricks. The first option is Partner Connect, which provides a streamlined setup to create your dbt Cloud account from within your new Databricks trial account. The second option is to create your dbt Cloud account separately and build the Databricks connection yourself (connect manually). If you want to get started quickly, dbt Labs recommends using Partner Connect. If you want to customize your setup from the very beginning and gain familiarity with the dbt Cloud setup flow, dbt Labs recommends connecting manually.

If you want to use Partner Connect, refer to [Connect to dbt Cloud using Partner Connect](https://docs.databricks.com/partners/prep/dbt-cloud.html#connect-to-dbt-cloud-using-partner-connect) in the Databricks docs for instructions.
## Set up the integration from Partner Connect

If you want to connect manually, refer to [Connect to dbt Cloud manually](https://docs.databricks.com/partners/prep/dbt-cloud.html#connect-to-dbt-cloud-manually) in the Databricks docs for instructions.
:::note
Partner Connect is intended for trial partner accounts. If your organization already has a dbt Cloud account, connect manually. Refer to [Connect to dbt Cloud manually](https://docs.databricks.com/partners/prep/dbt-cloud.html#connect-to-dbt-cloud-manually) in the Databricks docs for instructions.
:::

To connect dbt Cloud to Databricks using Partner Connect, do the following:

1. In the sidebar of your Databricks account, click **Partner Connect**.

2. Click the **dbt tile**.

3. Select a catalog from the drop-down list, and then click **Next**. The drop-down list displays catalogs you have read and write access to. If your workspace isn't Unity Catalog-enabled, the legacy Hive metastore (`hive_metastore`) is used.
4. If there are SQL warehouses in your workspace, select a SQL warehouse from the drop-down list. If your SQL warehouse is stopped, click **Start**.
5. If there are no SQL warehouses in your workspace:
    1. Click **Create warehouse**. A new tab opens in your browser that displays the **New SQL Warehouse** page in the Databricks SQL UI.
    2. Follow the steps in [Create a SQL warehouse](https://docs.databricks.com/en/sql/admin/create-sql-warehouse.html#create-a-sql-warehouse) in the Databricks docs.
    3. Return to the Partner Connect tab in your browser, and then close the **dbt tile**.
    4. Re-open the **dbt tile**.
    5. Select the SQL warehouse you just created from the drop-down list.
6. Select a schema from the drop-down list, and then click **Add**. The drop-down list displays schemas you have read and write access to. You can repeat this step to add multiple schemas.
## Set up a dbt Cloud managed repository
If you used Partner Connect, you can skip to [initializing your dbt project](#initialize-your-dbt-project-and-start-developing) as the Partner Connect provides you with a managed repository. Otherwise, you will need to create your repository connection.
    Partner Connect creates the following resources in your workspace:
    - A Databricks service principal named **DBT_CLOUD_USER**.
    - A Databricks personal access token that is associated with the **DBT_CLOUD_USER** service principal.
    Partner Connect also grants the following privileges to the **DBT_CLOUD_USER** service principal:
    - (Unity Catalog) **USE CATALOG**: Required to interact with objects within the selected catalog.
    - (Unity Catalog) **USE SCHEMA**: Required to interact with objects within the selected schema.
    - (Unity Catalog) **CREATE SCHEMA**: Grants the ability to create schemas in the selected catalog.
    - (Hive metastore) **USAGE**: Required to grant the **SELECT** and **READ_METADATA** privileges for the schemas you selected.
    - **SELECT**: Grants the ability to read the schemas you selected.
    - (Hive metastore) **READ_METADATA**: Grants the ability to read metadata for the schemas you selected.
    - **CAN_USE**: Grants permissions to use the SQL warehouse you selected.
7. Click **Next**.
    The **Email** box displays the email address for your Databricks account. dbt Labs uses this email address to prompt you to create a trial dbt Cloud account.
8. Click **Connect to dbt Cloud**.
    A new tab opens in your web browser, which displays the getdbt.com website.
9. Complete the on-screen instructions on the getdbt.com website to create your trial dbt Cloud account.
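
Partner Connect applies the privileges listed above for you, so nothing here needs to be run. As an illustration only, the following Python sketch builds the roughly equivalent Databricks SQL `GRANT` statements for the Unity Catalog and legacy Hive metastore cases. The function names (`quote_identifier`, `grant_statements`) and the example values are hypothetical, and the principal string is a placeholder: for a service principal, Databricks SQL generally expects its application ID rather than its display name. **CAN_USE** is omitted because it is a SQL warehouse permission and not a SQL privilege.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grant_statements(
    principal: str,
    catalog: str,
    schemas: List[str],
    unity_catalog: bool,
) -> List[str]:
    """Build GRANT statements that mirror the privileges listed above.

    Hypothetical helper for illustration; it only returns strings and
    does not connect to Databricks.
    """
    grantee = quote_identifier(principal)
    statements: List[str] = []

    if unity_catalog:
        # Catalog-level privileges exist only in Unity Catalog.
        cat = quote_identifier(catalog)
        statements.append(f"GRANT USE CATALOG ON CATALOG {cat} TO {grantee}")
        statements.append(f"GRANT CREATE SCHEMA ON CATALOG {cat} TO {grantee}")

    for schema in schemas:
        if unity_catalog:
            target = f"{quote_identifier(catalog)}.{quote_identifier(schema)}"
            statements.append(f"GRANT USE SCHEMA ON SCHEMA {target} TO {grantee}")
            statements.append(f"GRANT SELECT ON SCHEMA {target} TO {grantee}")
        else:
            # Legacy Hive metastore: USAGE is required before SELECT and
            # READ_METADATA take effect.
            target = quote_identifier(schema)
            statements.append(f"GRANT USAGE ON SCHEMA {target} TO {grantee}")
            statements.append(f"GRANT SELECT ON SCHEMA {target} TO {grantee}")
            statements.append(f"GRANT READ_METADATA ON SCHEMA {target} TO {grantee}")

    return statements


if __name__ == "__main__":
    # Example values are placeholders, not real identifiers.
    for stmt in grant_statements("example-application-id", "main", ["analytics"], True):
        print(stmt + ";")
```

Comparing the output for `unity_catalog=True` and `unity_catalog=False` shows the difference the list describes: Unity Catalog adds catalog-level privileges, while the legacy Hive metastore relies on **USAGE** and **READ_METADATA** at the schema level.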
## Set up a dbt Cloud managed repository
<Snippet path="tutorial-managed-repo" />
## Initialize your dbt project and start developing
Now that you have a repository configured, you can initialize your project and start development in dbt Cloud:
1. Click **Start developing in the IDE**. It might take a few minutes for your project to spin up for the first time as it establishes your git connection, clones your repo, and tests the connection to the warehouse.