[daggy-u] [dbt] - Add Lesson 5 (DEV-57) (#19947)
This PR adds Lesson 5 of the new dbt module to Dagster University.

TODOs

- [x] Add screenshots
- [ ] Update code snippets to use file import

---------

Co-authored-by: Tim Castillo <[email protected]>
2 people authored and alangenfeld committed Feb 28, 2024
1 parent 929c35a commit 46c29c7
Showing 3 changed files with 60 additions and 57 deletions.
7 changes: 6 additions & 1 deletion docs/dagster-university/pages/dagster-dbt.md
@@ -28,4 +28,9 @@ title: Dagster + dbt
  - [Overview](/dagster-dbt/lesson-4/1-overview)
  - [Speeding up the development cycle](/dagster-dbt/lesson-4/2-speeding-up-the-development-cycle)
  - [Debugging failed runs](/dagster-dbt/lesson-4/3-debugging-failed-runs)
  - [Customizing your execution](/dagster-dbt/lesson-4/4-customizing-your-execution)
  - [Customizing your execution](/dagster-dbt/lesson-4/4-customizing-your-execution)
- Lesson 5: Adding dependencies and automation to dbt models
  - [Overview](/dagster-dbt/lesson-5/1-overview)
  - [Connecting dbt models to Dagster assets](/dagster-dbt/lesson-5/2-connecting-dbt-models-to-dagster-assets)
  - [Creating assets that depend on dbt models](/dagster-dbt/lesson-5/3-creating-assets-that-depend-on-dbt-models)
  - [Automating dbt models in Dagster](/dagster-dbt/lesson-5/4-automating-dbt-models-in-dagster)
@@ -136,7 +136,7 @@ if os.getenv("DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"):
        .target_path.joinpath("manifest.json")
    )
else:
    dbt_manifest_path = os.path.join(DBT_DIRECTORY, "target", "manifest.json")
    dbt_manifest_path = DBT_DIRECTORY.joinpath("target", "manifest.json")


@dbt_assets(
@@ -75,20 +75,18 @@ Let's start by adding a new string constant to reference when building the new a
In the `assets/constants.py` file, add the following to the end of the file:
```python
AIRPORT_TRIPS_FILE_PATH = get_path_for_env(os.path.join("data", "outputs", "airport_trips.png"))
AIRPORT_TRIPS_FILE_PATH = Path(__file__).joinpath("..", "..", "outputs", "airport_trips.png").resolve()
```
This creates a path to where we want to save the chart. The `get_path_for_env` function is not specific to Dagster; it is a utility we've defined in this file to help with Lesson 7 (Deploying your Dagster and dbt project).
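To make the `Path(__file__).joinpath("..", "..", ...)` form of the constant easier to reason about, here is a minimal sketch of how it resolves. The `chart_path` helper and the temporary `assets/constants.py` layout are assumptions made for illustration; they are not part of the lesson's project.

```python
import tempfile
from pathlib import Path


def chart_path(module_file: Path) -> Path:
    """Hypothetical helper mirroring the constant's expression for a given module file."""
    # The first ".." steps from the file to its directory, the second ".." steps
    # to that directory's parent, and resolve() collapses both into an absolute path.
    return module_file.joinpath("..", "..", "outputs", "airport_trips.png").resolve()


with tempfile.TemporaryDirectory() as tmp:
    # Assumed layout for the sketch: <project>/assets/constants.py
    project = Path(tmp).resolve()
    module_file = project / "assets" / "constants.py"
    module_file.parent.mkdir(parents=True)
    module_file.touch()

    resolved = chart_path(module_file)
    expected = project / "outputs" / "airport_trips.png"
    print(resolved == expected)
```

Under this assumed layout, the chart lands in an `outputs` directory that sits next to the `assets` directory rather than inside it.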

### Creating the airport_trips asset
Now we’re ready to create the asset!
1. Open the `assets/metrics.py` file.
2. At the end of the file, define a new asset called `airport_trips` that takes the existing `DuckDBResource` named `database` and returns a `MaterializeResult`, indicating that we'll be returning some metadata:
2. At the end of the file, define a new asset called `airport_trips` with the context argument and the existing `DuckDBResource` named `database`:
```python
def airport_trips(database: DuckDBResource) -> MaterializeResult:
def airport_trips(context, database: DuckDBResource):
```
3. Add the asset decorator to the `airport_trips` function and specify the `location_metrics` model as a dependency:
@@ -97,61 +95,61 @@ Now we’re ready to create the asset!
@asset(
    deps=["location_metrics"],
)
def airport_trips(database: DuckDBResource) -> MaterializeResult:
def airport_trips(context, database: DuckDBResource):
```
**Note:** Because Dagster doesn’t discriminate and treats all dbt models as assets, you’ll add this dependency just like you would with any other asset.
4. Fill in the body of the function with the following code to follow a similar pattern to your project’s existing pipelines: query for the data, use a library to generate a chart, save the chart as a file, and embed the chart:
```python
@asset(
    deps=["location_metrics"],
)
def airport_trips(database: DuckDBResource) -> MaterializeResult:
    """
    A chart of where trips from the airport go
    """
    query = """
        select
            zone,
            destination_borough,
            trips
        from location_metrics
        where from_airport
    """
    with database.get_connection() as conn:
        airport_trips = conn.execute(query).fetch_df()
    fig = px.bar(
        airport_trips,
        x="zone",
        y="trips",
        color="destination_borough",
        barmode="relative",
        labels={
            "zone": "Zone",
            "trips": "Number of Trips",
            "destination_borough": "Destination Borough"
        },
    )
    pio.write_image(fig, constants.AIRPORT_TRIPS_FILE_PATH)
    with open(constants.AIRPORT_TRIPS_FILE_PATH, 'rb') as file:
        image_data = file.read()
    # Convert the image data to base64
    base64_data = base64.b64encode(image_data).decode('utf-8')
    md_content = f"![Image](data:image/jpeg;base64,{base64_data})"
    return MaterializeResult(
        metadata={
            "preview": MetadataValue.md(md_content)
        }
    )
@asset(
    deps=["location_metrics"],
)
def airport_trips(context, database: DuckDBResource):
    """
    A chart of where trips from the airport go
    """
    query = """
        select
            zone,
            destination_borough,
            trips
        from location_metrics
        where from_airport
    """
    with database.get_connection() as conn:
        airport_trips = conn.execute(query).fetch_df()
    fig = px.bar(
        airport_trips,
        x="zone",
        y="trips",
        color="destination_borough",
        barmode="relative",
        labels={
            "zone": "Zone",
            "trips": "Number of Trips",
            "destination_borough": "Destination Borough"
        },
    )
    pio.write_image(fig, constants.AIRPORT_TRIPS_FILE_PATH)
    with open(constants.AIRPORT_TRIPS_FILE_PATH, 'rb') as file:
        image_data = file.read()
    # Convert the image data to base64
    base64_data = base64.b64encode(image_data).decode('utf-8')
    md_content = f"![Image](data:image/jpeg;base64,{base64_data})"
    #TODO: Use `MaterializeResult` instead
    context.add_output_metadata({
        "preview": MetadataValue.md(md_content),
        "data": MetadataValue.json(airport_trips.to_dict(orient="records"))
    })
```
5. Reload your code location to see the new `airport_trips` asset within the `metrics` group. Notice how the asset graph links the dependency between the `location_metrics` dbt asset and the new `airport_trips` chart asset.
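The "embed the chart" part of step 4 can be tried on its own, without Dagster or Plotly. The sketch below uses only the standard library; `image_bytes_to_markdown` and `MIME_TYPES` are hypothetical names introduced for illustration, not part of the lesson's code. One detail that may be worth double-checking: the chart is saved as `airport_trips.png`, while the data URI in the asset declares `image/jpeg`. The sketch derives the MIME type from the file suffix instead, on the assumption that the declared type should match the file format.

```python
import base64

# Hypothetical suffix-to-MIME mapping for this sketch only.
MIME_TYPES = {".png": "image/png", ".jpg": "image/jpeg", ".jpeg": "image/jpeg"}


def image_bytes_to_markdown(image_data: bytes, suffix: str = ".png") -> str:
    """Build a Markdown image tag that embeds the image as a base64 data URI."""
    mime_type = MIME_TYPES[suffix.lower()]
    # b64encode returns bytes, so decode to str before placing it in the URI.
    base64_data = base64.b64encode(image_data).decode("utf-8")
    return f"![Image](data:{mime_type};base64,{base64_data})"


# Stand-in data: the eight-byte signature that starts every PNG file.
png_signature = b"\x89PNG\r\n\x1a\n"
md_content = image_bytes_to_markdown(png_signature)
print(md_content)
```

In the asset, the resulting string is what gets wrapped in `MetadataValue.md(...)` for the `preview` metadata entry.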
