From f73d328dad1a1d81146ec4e9452b42c648d79527 Mon Sep 17 00:00:00 2001 From: Maxime Armstrong <46797220+maximearmstrong@users.noreply.github.com> Date: Tue, 6 Aug 2024 21:56:28 -0400 Subject: [PATCH] [daggy-u][dbt] Update dbt course to use DbtProject (#23098) ## Summary & Motivation This PR updates the dbt course in Dagster University to use DbtProject. To do outside this PR: - update the knowledge check/quiz in Lesson 4. The `DAGSTER_DBT_PARSE_PROJECT_ON_LOAD` env var is no longer used. ## How I Tested These Changes (cherry picked from commit b6156a414449a2a5bc6918ace40939dd7ed71a4a) --- ...ing-the-dbt-project-location-in-dagster.md | 31 --------- ...representing-the-dbt-project-in-dagster.md | 37 ++++++++++ .../4-creating-a-dbt-resource-in-dagster.md | 8 +-- ...ading-dbt-models-into-dagster-as-assets.md | 31 +++------ .../dagster-dbt/lesson-3/knowledge-check.md | 2 +- .../2-speeding-up-the-development-cycle.md | 69 +++---------------- ...connecting-dbt-models-to-dagster-assets.md | 25 ++----- .../3-creating-a-partitioned-dbt-asset.md | 30 +++----- ...creating-the-manifest-during-deployment.md | 27 +++++--- .../5-preparing-for-a-successful-run.md | 52 +++++++++++--- 10 files changed, 133 insertions(+), 179 deletions(-) delete mode 100644 docs/dagster-university/pages/dagster-dbt/lesson-3/3-defining-the-dbt-project-location-in-dagster.md create mode 100644 docs/dagster-university/pages/dagster-dbt/lesson-3/3-representing-the-dbt-project-in-dagster.md diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-3/3-defining-the-dbt-project-location-in-dagster.md b/docs/dagster-university/pages/dagster-dbt/lesson-3/3-defining-the-dbt-project-location-in-dagster.md deleted file mode 100644 index b6dfd94c2c234..0000000000000 --- a/docs/dagster-university/pages/dagster-dbt/lesson-3/3-defining-the-dbt-project-location-in-dagster.md +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: 'Lesson 3: Defining the dbt project location in Dagster' -module: 'dagster_dbt' -lesson: '3' 
---- - -# Defining the dbt project location in Dagster - -As you’ll frequently point your Dagster code to the `target/manifest.json` file and your dbt project in this course, it’ll be helpful to keep a reusable constant to reference where the dbt project is. - -In the finished Dagster Essentials project, there should be a file called `assets/constants.py`. Open that file and add the following import at the top: - -```python -from pathlib import Path -``` - -The `Path` class from the `pathlib` standard library will help us create an accurate pointer to where our dbt project is. At the bottom of `constants.py`, add the following line: - -```python -DBT_DIRECTORY = Path(__file__).joinpath("..", "..", "..", "analytics").resolve() -``` - -This line creates a new constant called `DBT_DIRECTORY`. This line might look a little complicated, so let’s break it down: - -- It uses the location of the `constants.py` file (via `__file__`) as a point of reference for finding the dbt project -- The arguments in `joinpath` point us towards our dbt project by appending the following to the current path: - - Three directory levels up (`"..", "..", ".."`) - - A directory named `analytics`, which is the directory containing our dbt project -- The `resolve` method turns that path into an absolute file path that points to the dbt project correctly from any file we’re working in - -Now that you can access your dbt project from any other file with the `DBT_DIRECTORY` constant, let’s move on to the first place where you’ll use it: creating the Dagster resource that will run dbt. 
\ No newline at end of file diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-3/3-representing-the-dbt-project-in-dagster.md b/docs/dagster-university/pages/dagster-dbt/lesson-3/3-representing-the-dbt-project-in-dagster.md new file mode 100644 index 0000000000000..678f150e0f824 --- /dev/null +++ b/docs/dagster-university/pages/dagster-dbt/lesson-3/3-representing-the-dbt-project-in-dagster.md @@ -0,0 +1,37 @@ +--- +title: 'Lesson 3: Representing the dbt project in Dagster' +module: 'dagster_dbt' +lesson: '3' +--- + +# Representing the dbt project in Dagster + +As you’ll frequently point your Dagster code to the `target/manifest.json` file and your dbt project in this course, it’ll be helpful to keep a reusable representation of the dbt project. This can be easily done using the `DbtProject` class. + +In the `dagster_university` directory, create a new `project.py` file and add the following imports: + +```python +from pathlib import Path + +from dagster_dbt import DbtProject +``` + +The `Path` class from the `pathlib` standard library will help us create an accurate pointer to where our dbt project is. The `DbtProject` class is imported from the `dagster_dbt` package that we installed earlier. + +After the import, add the following code: + +```python +dbt_project = DbtProject( + project_dir=Path(__file__).joinpath("..", "..", "analytics").resolve(), +) +``` + +This code creates a representation of the dbt project called `dbt_project`. 
The code defining the location of the project directory might look a little complicated, so let’s break it down: + +- The location of the `project.py` file (via `__file__`) is used as a point of reference for finding the dbt project +- The arguments in `joinpath` point us towards our dbt project by appending the following to the current path: + - Two directory levels up (`"..", ".."`) + - A directory named `analytics`, which is the directory containing our dbt project +- The `resolve` method turns that path into an absolute file path that points to the dbt project correctly from any file we’re working in + +Now that you can access your dbt project from any other file with the `dbt_project` representation, let’s move on to the first place where you’ll use it: creating the Dagster resource that will run dbt. \ No newline at end of file diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-3/4-creating-a-dbt-resource-in-dagster.md b/docs/dagster-university/pages/dagster-dbt/lesson-3/4-creating-a-dbt-resource-in-dagster.md index bead2210365b5..fb3cfdf1e1812 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-3/4-creating-a-dbt-resource-in-dagster.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-3/4-creating-a-dbt-resource-in-dagster.md @@ -20,18 +20,18 @@ Navigate to the `dagster_university/resources/__init__.py`, which is where other ```python from dagster_dbt import DbtCliResource -from ..assets.constants import DBT_DIRECTORY +from ..project import dbt_project # the import lines go at the top of the file # this can be defined anywhere below the imports dbt_resource = DbtCliResource( - project_dir=DBT_DIRECTORY, + project_dir=dbt_project, ) ``` The code above: 1. Imports the `DbtCliResource` from the `dagster_dbt` package that we installed earlier -2. Imports the `DBT_DIRECTORY` constant we just defined +2. Imports the `dbt_project` representation we just defined 3. 
Instantiates a new `DbtCliResource` under the variable name `dbt_resource` -4. Tells the resource that the dbt project to execute is found at `DBT_DIRECTORY` +4. Tells the resource that the dbt project to execute is the `dbt_project` diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-3/5-loading-dbt-models-into-dagster-as-assets.md b/docs/dagster-university/pages/dagster-dbt/lesson-3/5-loading-dbt-models-into-dagster-as-assets.md index c9b7cdef5a053..19d4fb7140226 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-3/5-loading-dbt-models-into-dagster-as-assets.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-3/5-loading-dbt-models-into-dagster-as-assets.md @@ -35,30 +35,22 @@ We’ll only create one `@dbt_assets` definition for now, but in a later lesson, ```python from dagster import AssetExecutionContext from dagster_dbt import dbt_assets, DbtCliResource - - import os - - from .constants import DBT_DIRECTORY - ``` - -3. The `@dbt_assets` decorator requires a path to the project’s manifest file, which is within our `DBT_DIRECTORY`. Use that constant to create a path to the `manifest.json` by copying and pasting the code below: - - ```python - dbt_manifest_path = os.path.join(DBT_DIRECTORY, "target", "manifest.json") + + from ..project import dbt_project ``` - Similar to how we used `joinpath` earlier to point to the dbt project’s directory, we’re using it once again to reference `target/manifest.json` more precisely. - -4. Now, use the `@dbt_assets` decorator to create a new asset function and provide it with a reference to the manifest: +3. Next, we'll use the `@dbt_assets` decorator to create a new asset function and provide it with a reference to the project's manifest file: ```python @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, ) def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): ``` -5. 
Finally, add the following to the body of `dbt_analytics` function: + Here, we used `dbt_project.manifest_path` to provide the reference to the project's manifest file. This is possible because the `dbt_project` representation we created earlier contains the manifest path, accessible by using the `manifest_path` attribute. + +4. Finally, add the following to the body of `dbt_analytics` function: ```python yield from dbt.cli(["run"], context=context).stream() @@ -77,16 +69,13 @@ At this point, `dbt.py` should look like this: ```python from dagster import AssetExecutionContext -from dagster_dbt import dbt_assets, DbtCliResource - -from .constants import DBT_DIRECTORY - +from dagster_dbt import DbtCliResource, dbt_assets -dbt_manifest_path = DBT_DIRECTORY.joinpath("target", "manifest.json") +from ..project import dbt_project @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, ) def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): yield from dbt.cli(["run"], context=context).stream() diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-3/knowledge-check.md b/docs/dagster-university/pages/dagster-dbt/lesson-3/knowledge-check.md index 019cd1b6ae221..87ada1d52e455 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-3/knowledge-check.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-3/knowledge-check.md @@ -12,7 +12,7 @@ lesson: '3' ```python @dbt_assets( - manifest=dbt_manifest_path + manifest=dbt_project.manifest_path ) def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): yield from dbt.cli(["build"], context=context).stream() diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-4/2-speeding-up-the-development-cycle.md b/docs/dagster-university/pages/dagster-dbt/lesson-4/2-speeding-up-the-development-cycle.md index 6c148e16df26d..5b31e3659a05a 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-4/2-speeding-up-the-development-cycle.md +++ 
b/docs/dagster-university/pages/dagster-dbt/lesson-4/2-speeding-up-the-development-cycle.md @@ -6,79 +6,30 @@ lesson: '4' # Speeding up the development cycle -By now, you’ve had to run `dbt parse` and reload your code location quite frequently, which doesn’t feel like the cleanest developer experience. +By now, you’ve had to run `dbt parse` to create the manifest file and reload your code location quite frequently, which doesn’t feel like the cleanest developer experience. -Before we move on, we’ll reduce the number of steps in the feedback loop. We'll automate the `dbt parse` command by taking advantage of the `DbtCliResource` that we wrote earlier. +Before we move on, we’ll reduce the number of steps in the feedback loop. We'll automate the creation of the manifest file by taking advantage of the `dbt_project` representation that we wrote earlier. --- -## Automating running dbt parse in development +## Automating creating the manifest file in development -The first detail is that resources don’t need to be part of an asset to be executed. This means that once a `dbt_resource` is defined, you can use it to execute commands when your code location is being built. Rather than manually running `dbt parse`, let’s use the `dbt_resource` to run the command for us. +The first detail is that the `dbt_project` doesn’t need to be part of an asset to be executed. This means that once a `dbt_project` is defined, you can use it to execute commands when your code location is being built. Rather than manually running `dbt parse`, let’s use the `dbt_project` to prepare the manifest file for us. 
-In `dbt.py`, import the `dbt_resource` and the `Path` class from the `pathlib` standard library with: +In `project.py`, after the code initializing `dbt_project`, add the following code: ```python -from pathlib import Path - -from ..resources import dbt_resource -``` - -Afterward, above your `dbt_manifest_path` declaration, add this snippet to run `dbt parse`: - -```python -dbt_resource.cli(["--quiet", "parse"], target_path=Path("target")).wait() +dbt_project.prepare_if_dev() ``` -If you look at the dbt project’s `/target` directory, you’ll see it stores the artifacts. To read from the generated manifest, you can retrieve the path to this folder from the return value of the `.wait()` call. - -Let’s define a new `dbt_manifest_path` that will always point to the `manifest.json` that was just created from this programmatic `dbt parse` command: - -```python -dbt_manifest_path = ( - dbt_resource.cli( - ["--quiet", "parse"], - target_path=Path("target"), - ) - .wait() - .target_path.joinpath("manifest.json") -) -``` +If you look at the dbt project’s `/target` directory, you’ll see it stores the artifacts. When you use `dagster dev` in local development and you reload your code, you'll see that a new manifest file is generated. Reload your code location in the Dagster UI, and you’ll see that everything should still work: the dbt models are still shown as assets and you can manually materialize any of the models. The key difference is that you no longer have to manually run `dbt parse` anymore! --- -## Specifying manifest build behavior in production - -This is great, however, it might feel a bit greedy and intensive to be constantly building a new manifest file. This is especially the case in production where a dbt project is stable. Therefore, let’s lock this computation behind an environment variable and defer to a single copy of our manifest in production. - -1. 
In the `.env` file, define an environment variable named `DAGSTER_DBT_PARSE_PROJECT_ON_LOAD` and set it to `1`: - - ```python - DUCKDB_DATABASE=data/staging/data.duckdb - DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1 # New env var defined here - ``` - -2. Next, import the `os` module at the top of the `dbt.py` file so the environment variable is accessible: - - ```python - import os - ``` - -3. Finally, let’s check to see if the variable is set: - - - **If it is**, we’ll use our new logic to generate a new manifest file every time the code location is built - - **If it isn’t**, then we’ll use our old logic of depending on a specific `manifest.json` in the `target` directory. +## Creating the manifest for production - Copy and paste the code to finalize the definition of `dbt_manifest_path`: +This is great, however, it only handles the preparation of a new manifest file in local development. In production, where a dbt project is stable, we may want to prepare a new manifest file only at build time, during the deployment process. This can be done using the command line interface (CLI) available in the `dagster_dbt` package. - ```python - if os.getenv("DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"): - dbt_manifest_path = ( - dbt_resource.cli(["--quiet", "parse"]).wait() - .target_path.joinpath("manifest.json") - ) - else: - dbt_manifest_path = os.path.join(DBT_DIRECTORY, "target", "manifest.json") - ``` +Don't worry about the details for now! In Lesson 7, we’ll discuss the details on how to create a manifest file programmatically during deployment using the `dagster_dbt` CLI. 
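As a side note on the `project_dir` expression used in `project.py` and the `target/manifest.json` file these lessons keep referring to, here is a minimal standard-library sketch of how that path resolves. It does not use `dagster_dbt` at all. The helper names `resolve_project_dir`, `manifest_path_for`, and `demo`, as well as the temporary directory layout, are hypothetical and exist only for illustration. The `target/manifest.json` location is taken from the lesson text, not read from the library.

```python
import tempfile
from pathlib import Path


def resolve_project_dir(anchor_file: Path) -> Path:
    """Mirror the `project_dir` expression from project.py for any anchor file."""
    # joinpath appends the segments verbatim; resolve() then collapses each "..".
    # The first ".." cancels the file name, the second leaves the package directory.
    return anchor_file.joinpath("..", "..", "analytics").resolve()


def manifest_path_for(project_dir: Path) -> Path:
    """Assumed layout: the manifest lives at target/manifest.json in the dbt project."""
    return project_dir / "target" / "manifest.json"


def demo() -> tuple:
    """Build a throwaway layout that imitates the course repository (hypothetical)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()  # resolve first so symlinked temp dirs compare equal
        anchor = root / "dagster_university" / "project.py"
        anchor.parent.mkdir()
        anchor.touch()
        project_dir = resolve_project_dir(anchor)
        return root, project_dir, manifest_path_for(project_dir)


if __name__ == "__main__":
    root, project_dir, manifest = demo()
    print(project_dir.relative_to(root))
    print(manifest.relative_to(root))
```

Running the sketch shows the dbt project directory landing next to the `dagster_university` package rather than inside it, which is why two `".."` segments are enough.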
diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-5/2-connecting-dbt-models-to-dagster-assets.md b/docs/dagster-university/pages/dagster-dbt/lesson-5/2-connecting-dbt-models-to-dagster-assets.md index 2139852b2c13d..6f0851e9b674f 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-5/2-connecting-dbt-models-to-dagster-assets.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-5/2-connecting-dbt-models-to-dagster-assets.md @@ -95,7 +95,7 @@ Open the `assets/dbt.py` file and do the following: ```python @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator() ) def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): @@ -105,12 +105,10 @@ Open the `assets/dbt.py` file and do the following: At this point, your `dbt.py` file should match the following: ```python -import os from dagster import AssetExecutionContext, AssetKey -from dagster_dbt import dbt_assets, DbtCliResource, DagsterDbtTranslator +from dagster_dbt import DagsterDbtTranslator, DbtCliResource, dbt_assets -from .constants import DBT_DIRECTORY -from ..resources import dbt_resource +from ..project import dbt_project class CustomizedDagsterDbtTranslator(DagsterDbtTranslator): @@ -122,21 +120,10 @@ class CustomizedDagsterDbtTranslator(DagsterDbtTranslator): else: return super().get_asset_key(dbt_resource_props) - -dbt_resource.cli(["--quiet", "parse"]).wait() - -if os.getenv("DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"): - dbt_manifest_path = ( - dbt_resource.cli(["--quiet", "parse"]) - .wait() - .target_path.joinpath("manifest.json") - ) -else: - dbt_manifest_path = os.path.join(DBT_DIRECTORY, "target", "manifest.json") - - + @dbt_assets( - manifest=dbt_manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator() + manifest=dbt_project.manifest_path, + dagster_dbt_translator=CustomizedDagsterDbtTranslator(), ) def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): yield 
from dbt.cli(["build"], context=context).stream() diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-6/3-creating-a-partitioned-dbt-asset.md b/docs/dagster-university/pages/dagster-dbt/lesson-6/3-creating-a-partitioned-dbt-asset.md index 51f1643c78598..98df443878c20 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-6/3-creating-a-partitioned-dbt-asset.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-6/3-creating-a-partitioned-dbt-asset.md @@ -45,7 +45,7 @@ Previously, we used the `@dbt_assets` decorator to say _“this function produce ```python @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator() ) def incremental_dbt_models( @@ -59,7 +59,7 @@ Previously, we used the `@dbt_assets` decorator to say _“this function produce ```python @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator(), select=INCREMENTAL_SELECTOR, # select only models with INCREMENTAL_SELECTOR partitions_def=daily_partition # partition those models using daily_partition @@ -110,7 +110,7 @@ Modify the `dbt_analytics` definition to exclude the `INCREMENTAL_SELECTOR`: ```python @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator(), exclude=INCREMENTAL_SELECTOR, # Add this here ) @@ -121,15 +121,13 @@ def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): At this point, the `dagster_university/assets/dbt.py` file should look like this: ```python -import os import json + from dagster import AssetExecutionContext, AssetKey -from dagster_dbt import dbt_assets, DbtCliResource, DagsterDbtTranslator +from dagster_dbt import DagsterDbtTranslator, DbtCliResource, dbt_assets -from .constants import DBT_DIRECTORY from ..partitions import daily_partition -from ..resources import dbt_resource - +from ..project import 
dbt_project INCREMENTAL_SELECTOR = "config.materialized:incremental" @@ -144,20 +142,8 @@ class CustomizedDagsterDbtTranslator(DagsterDbtTranslator): return super().get_asset_key(dbt_resource_props) -dbt_resource.cli(["--quiet", "parse"]).wait() - -if os.getenv("DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"): - dbt_manifest_path = ( - dbt_resource.cli(["--quiet", "parse"]) - .wait() - .target_path.joinpath("manifest.json") - ) -else: - dbt_manifest_path = DBT_DIRECTORY.joinpath("target", "manifest.json") - - @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator(), exclude=INCREMENTAL_SELECTOR, ) @@ -166,7 +152,7 @@ def dbt_analytics(context: AssetExecutionContext, dbt: DbtCliResource): @dbt_assets( - manifest=dbt_manifest_path, + manifest=dbt_project.manifest_path, dagster_dbt_translator=CustomizedDagsterDbtTranslator(), select=INCREMENTAL_SELECTOR, partitions_def=daily_partition, diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-7/4-creating-the-manifest-during-deployment.md b/docs/dagster-university/pages/dagster-dbt/lesson-7/4-creating-the-manifest-during-deployment.md index 25f09d1a0c90a..762413ea6c33f 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-7/4-creating-the-manifest-during-deployment.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-7/4-creating-the-manifest-during-deployment.md @@ -6,13 +6,13 @@ lesson: '7' # Creating the manifest during deployment -To recap, our deployment failed in the last section because Dagster couldn’t find a dbt manifest file, which it needs to turn dbt models into Dagster assets. This is because we built this file by running `dbt parse` during local development. You ran this manually in Lesson 3 and improved the experience in Lesson 4. However, you'll also need to build your dbt manifest file during deployment, which will require a couple additional steps. We recommend adopting CI/CD to automate this process. 
+To recap, our deployment failed in the last section because Dagster couldn’t find a dbt manifest file, which it needs to turn dbt models into Dagster assets. This is because we built this file by running `dbt parse` during local development. You ran this manually in Lesson 3 and improved the experience using `DbtProject`'s `prepare_if_dev` in Lesson 4. However, you'll also need to build your dbt manifest file during deployment, which will require a couple additional steps. We recommend adopting CI/CD to automate this process. -Building your manifest for your production deployment will will be needed for both open source and Dagster+ deployments. In this case, Dagster+’s out-of-the-box `deploy.yml` GitHub Action isn’t aware that you’re also trying to deploy a dbt project with Dagster. +Building your manifest for your production deployment will be needed for both open source and Dagster+ deployments. In this case, Dagster+’s out-of-the-box `deploy.yml` GitHub Action isn’t aware that you’re also trying to deploy a dbt project with Dagster. -Since your CI/CD will be running in a fresh environment, you'll need to install dbt and run `dbt deps` before building your manifest with `dbt parse`. +Since your CI/CD will be running in a fresh environment, you'll need to install dbt and other dependencies before building your manifest. -To get our deployment working, we need to add a step to our GitHub Actions workflow that runs the dbt commands required to generate the `manifest.json`. Specifically, we need to run `dbt deps` and `dbt parse` in the dbt project, just like you did during local development. +To get our deployment working, we need to add a step to our GitHub Actions workflow that runs the commands required to generate the `manifest.json`. Specifically, we need to run the `dagster-dbt project prepare-and-package` command, available in the `dagster_dbt` package. 1. In your Dagster project, locate the `.github/workflows` directory. 2. Open the `deploy.yml` file. 
@@ -20,18 +20,23 @@ To get our deployment working, we need to add a step to our GitHub Actions workf 4. After this step, add the following: ```yaml - - name: Parse dbt project and package with Dagster project + - name: Prepare DBT project for deployment if: steps.prerun.outputs.result == 'pex-deploy' run: | pip install pip --upgrade - pip install dbt-duckdb - cd project-repo/analytics - dbt deps - dbt parse + cd project-repo + pip install . --upgrade --upgrade-strategy eager + dagster-dbt project prepare-and-package --file dagster_university/project.py shell: bash ``` - -5. Save and commit the changes. Make sure to push them to the remote! + +The code above: + +1. Creates a step named `Prepare DBT project for deployment` +2. Upgrades `pip`, the package installer for Python +3. Navigates inside the `project-repo` folder +4. Installs the project and eagerly upgrades its dependencies +5. Prepares the manifest file by running the `dagster-dbt project prepare-and-package` command, specifying the file in which the `DbtProject` object is located. Once the new step is pushed to the remote, GitHub will automatically try to run a new job using the updated workflow. diff --git a/docs/dagster-university/pages/dagster-dbt/lesson-7/5-preparing-for-a-successful-run.md b/docs/dagster-university/pages/dagster-dbt/lesson-7/5-preparing-for-a-successful-run.md index daf87efc8fef8..d7db4993d1e0b 100644 --- a/docs/dagster-university/pages/dagster-dbt/lesson-7/5-preparing-for-a-successful-run.md +++ b/docs/dagster-university/pages/dagster-dbt/lesson-7/5-preparing-for-a-successful-run.md @@ -61,22 +61,47 @@ Because we’re still using a DuckDB-backed database, our `type` will also be `d --- +## Adding a prod target to DbtProject + +Next, we need to update the `DbtProject` object in `dagster_university/project.py` to specify what profile to target. To optimize the developer experience, let’s use an environment variable to specify the profile to target. + +1. 
In the `.env` file, define an environment variable named `DBT_TARGET` and set it to `dev`: + + ```python + DBT_TARGET=dev + ``` + +2. Next, import the `os` module at the top of the `project.py` file so the environment variable is accessible: + + ```python + import os + ``` + +3. Finally, scroll to the initialization of the `DbtProject` object, and use the new environment variable to access the profile to target. This should be on or around line 11: + +```python +dbt_project = DbtProject( + project_dir=Path(__file__).joinpath("..", "..", "analytics").resolve(), + target=os.getenv("DBT_TARGET") +) +``` + +--- + ## Adding a prod target to deploy.yml Next, we need to update the dbt commands in the `.github/workflows/deploy.yml` file to target the new `prod` profile. This will ensure that dbt uses the correct connection details when the GitHub Action runs as part of our Dagster+ deployment. -Open the file, scroll to the dbt step you added, and add `-- target prod` after the `dbt parse` command. This command should be on or around line 52: +Open the file, scroll to the environment variable section, and set an environment variable named `DBT_TARGET` to `prod`. This should be on or around line 12: ```bash -- name: Parse dbt project and package with Dagster project - if: steps.prerun.outputs.result == 'pex-deploy' - run: | - pip install pip --upgrade - pip install dbt-duckdb - cd project-repo/analytics - dbt deps - dbt parse --target prod ## add this flag - shell: bash +env: + DAGSTER_CLOUD_URL: ${{ secrets.DAGSTER_CLOUD_ORGANIZATION }} + DAGSTER_CLOUD_API_TOKEN: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }} + ENABLE_FAST_DEPLOYS: 'true' + PYTHON_VERSION: '3.8' + DAGSTER_CLOUD_FILE: 'dagster_cloud.yaml' + DBT_TARGET: 'prod' ``` Save and commit the file to git. Don’t forget to push to remote! @@ -104,7 +129,12 @@ The following table contains the environment variables we need to create in Dags --- - `DAGSTER_ENVIRONMENT` -- Set this to `prod`. 
This will be used by your dbt resource to decide which target to use. +- Set this to `prod`. This will be used by your resources and constants. + +--- + +- `DBT_TARGET` +- Set this to `prod`. This will be used by your dbt project and dbt resource to decide which target to use. ---
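The `target=os.getenv("DBT_TARGET")` argument shown in the `project.py` change above evaluates to `None` whenever the variable is unset, and to an empty string if it is defined but blank. The following is a minimal sketch, using only the standard library, of a helper that makes the fallback explicit. The name `select_dbt_target` and its `default` parameter are hypothetical and are not part of `dagster_dbt`. How `DbtProject` treats a `None` target is not covered here.

```python
import os
from typing import Mapping, Optional


def select_dbt_target(
    env: Mapping[str, str] = os.environ,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the DBT_TARGET value from `env`, or `default` if it is unset or blank."""
    value = env.get("DBT_TARGET")
    # Treat an empty string the same as a missing variable.
    return value if value else default


if __name__ == "__main__":
    # Reads the real environment; prints the fallback when DBT_TARGET is not set.
    print(select_dbt_target(default="dev"))
```

Passing the environment in as a mapping keeps the helper easy to check without mutating `os.environ`, for example `select_dbt_target({"DBT_TARGET": "prod"})`.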