Commit

updated code references and sidebar
C00ldudeNoonan committed Dec 27, 2024
1 parent 62e5fd0 commit aae8195
Showing 10 changed files with 97 additions and 478 deletions.
467 changes: 0 additions & 467 deletions docs/docs-beta/docs/tutorial/08-refactoring-the-project.md

This file was deleted.

File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ last_update:

# Asset Dependencies

Now that we have the raw data loaded into DuckDB, we need to create a [downstream asset](guides/asset-dependencies.md) that combines the upstream assets together. In this step, you will:
Now that we have the raw data loaded into DuckDB, we need to create a [downstream asset](guides/build/asset-concepts/asset-dependencies) that combines the upstream assets together. In this step, you will:

- Create a downstream asset
- Materialize that asset
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ last_update:

Data Quality is critical in data pipelines. Much like in a factory producing cars, inspecting parts after they complete certain steps ensures that defects are caught before the car is completely assembled.

In Dagster, you define [asset checks](guides/asset-checks.md) in a similar way that you would define an Asset. In this step you will:
In Dagster, you define [asset checks](guides/build/test/asset-checks) in a similar way that you would define an Asset. In this step you will:

- Define an asset check
- Execute that asset check in the UI
Expand Down
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
---
title: Refactoring the Project
description: Refactor the completed project into a structure that is more organized and scalable.
last_update:
author: Alex Noonan
---

# Refactoring code

Many engineers leave something alone once it's working as expected. But the first implementation of a use case is rarely the best one, and all projects benefit from incremental improvements.

## Splitting up project structure

Right now the project is contained within one definitions file. This has become unwieldy, and adding more to the project would only make it more disorganized. So we're going to create separate files for the different Dagster core concepts:

- Assets
- Schedules
- Sensors
- Partitions

The final project structure should look like this:
```
dagster-etl-tutorial/
├── data/
│   ├── products.csv
│   ├── sales_data.csv
│   ├── sales_reps.csv
│   └── sample_request/
│       └── request.json
├── etl_tutorial/
│   ├── assets.py
│   ├── definitions.py
│   ├── partitions.py
│   ├── schedules.py
│   └── sensors.py
├── pyproject.toml
├── setup.cfg
└── setup.py
```
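
As a rough aid, the package layout above could be checked with a small standard-library script. This is only a sketch: `EXPECTED_FILES` and `missing_files` are illustrative names chosen here, not part of Dagster or of the tutorial code.

```python
from pathlib import Path

# Files the refactored package is expected to contain (taken from the tree above).
EXPECTED_FILES = [
    "etl_tutorial/assets.py",
    "etl_tutorial/definitions.py",
    "etl_tutorial/partitions.py",
    "etl_tutorial/schedules.py",
    "etl_tutorial/sensors.py",
]


def missing_files(project_root: Path) -> list[str]:
    """Return the expected files that do not exist under project_root."""
    return [name for name in EXPECTED_FILES if not (project_root / name).is_file()]
```

Calling `missing_files(Path("dagster-etl-tutorial"))` after the refactor should return an empty list if every file was created.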

### Assets

Assets make up the majority of our project, so this will be our largest file.

<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/assets.py" language="python" lineStart="1" lineEnd="292"/>

### Schedules

The schedules file will only contain the `weekly_update_schedule`.

<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/schedules.py" language="python" lineStart="1" lineEnd="8"/>

### Sensors

The sensors file will have the job and sensor for the `adhoc_request` asset.

<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/sensors.py" language="python" lineStart="1" lineEnd="47"/>

## Adjusting definitions object

Now that the elements live in separate files, we need to adjust how they are added to the definitions object.

1. Imports

The Dagster project runs from the root directory, so imports and file references need to use the root as their starting point.

Additionally, Dagster provides two functions, `load_assets_from_modules` and `load_asset_checks_from_modules`, that load all the assets and asset checks defined in a module.

2. Definitions

To bring our project together, copy the following code into your `definitions.py` file:

<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/definitions.py" language="python" lineStart="1" lineEnd="19"/>
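
The import rule from step 1 can be illustrated without Dagster at all. The sketch below builds a throwaway package in a temporary directory and imports one of its modules by its path from the project root. Everything in it is hypothetical: `demo_etl_tutorial` stands in for the `etl_tutorial/` package, and `import_from_root` and `ASSET_NAMES` are names invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_root(project_root: Path, dotted_name: str):
    """Import dotted_name with project_root temporarily first on sys.path."""
    sys.path.insert(0, str(project_root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module(dotted_name)
    finally:
        sys.path.remove(str(project_root))


# Build a throwaway project so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    package = root / "demo_etl_tutorial"  # hypothetical stand-in for etl_tutorial/
    package.mkdir()
    (package / "__init__.py").write_text("")
    (package / "assets.py").write_text("ASSET_NAMES = ['products', 'sales_data']\n")

    # The module is addressed from the project root: package first, then file.
    assets = import_from_root(root, "demo_etl_tutorial.assets")
    print(assets.ASSET_NAMES)
```

The same idea applies to the real project: when the root is the starting point, the assets module is referenced through its package (for example, as a module of `etl_tutorial`) and not by a bare file name.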

## Quick Validation

If you want to check that your definitions file loads and validates, you can run `dagster definitions validate` in the same directory where you would run `dagster dev`. This command is useful in CI/CD pipelines because it lets you confirm that your project loads correctly without starting the webserver.
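
For a CI step, the command could be wrapped in a short Python helper such as the sketch below. The helper names (`build_validate_command`, `validate_definitions`) and the injectable `runner` parameter are assumptions made for illustration, and the sketch assumes the command signals failure with a non-zero exit code. No flags beyond the command named above are used.

```python
import subprocess
from pathlib import Path
from typing import Callable


def build_validate_command() -> list[str]:
    """Return the CLI invocation named in the text, with no extra flags."""
    return ["dagster", "definitions", "validate"]


def validate_definitions(
    project_root: Path,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Run the validation command in project_root; True means exit code 0."""
    result = runner(build_validate_command(), cwd=project_root)
    return result.returncode == 0
```

A CI script might then call `validate_definitions(Path("."))` from the project root and fail the job when it returns `False`. The `runner` argument exists only so the helper can be exercised in tests without the `dagster` CLI installed.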

## That's it!

Congratulations! You have completed your first project with Dagster and have an example of how to use the building blocks to build your own data pipelines.

## Recommended next steps

- Join our [Slack community](https://dagster.io/slack).
- Continue learning with [Dagster University](https://courses.dagster.io/) courses.
- Start a [free trial of Dagster+](https://dagster.cloud/signup) for your own project.
17 changes: 8 additions & 9 deletions docs/docs-beta/sidebars.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,15 +11,14 @@ const sidebars: SidebarsConfig = {
type: 'category',
label: 'Tutorial',
collapsed: false,
items: ['tutorial/tutorial-etl/',
'tutorial/etl-tutorial-introduction',
'tutorial/create-and-materialize-assets',
'tutorial/create-and-materialize-a-downstream-asset',
'tutorial/ensuring-data-quality-with-asset-checks',
'tutorial/create-and-materialize-partitioned-asset',
'tutorial/automating-your-pipeline',
'tutorial/creating-a-sensor-asset',
'tutorial/refactoring-the-project',
items: ['tutorial/etl-tutorial/etl-tutorial-introduction',
'tutorial/etl-tutorial/create-and-materialize-assets',
'tutorial/etl-tutorial/create-and-materialize-a-downstream-asset',
'tutorial/etl-tutorial/ensuring-data-quality-with-asset-checks',
'tutorial/etl-tutorial/create-and-materialize-partitioned-asset',
'tutorial/etl-tutorial/automating-your-pipeline',
'tutorial/etl-tutorial/creating-a-sensor-asset',
'tutorial/etl-tutorial/refactoring-the-project',
'tutorial/multi-asset-integration'],
},
{
Expand Down
