Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent
62e5fd0
commit aae8195
Showing 10 changed files with 97 additions and 478 deletions.
467 changes: 0 additions & 467 deletions
docs/docs-beta/docs/tutorial/08-refactoring-the-project.md
This file was deleted.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
87 changes: 87 additions & 0 deletions
docs/docs-beta/docs/tutorial/etl-tutorial/08-refactoring-the-project.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
--- | ||
title: Refactoring the Project | ||
description: Refactor the completed project into a structure that is more organized and scalable. | ||
last_update: | ||
author: Alex Noonan | ||
--- | ||
|
||
# Refactoring code | ||
|
||
Many engineers leave code alone once it's working as expected. However, a first implementation is rarely the best one for a use case, and all projects benefit from incremental improvements. | ||
|
||
## Splitting up project structure | ||
|
||
Right now, the whole project is contained in a single definitions file. That file has become unwieldy, and adding more to the project would only make it more disorganized. To fix this, we're going to create separate files for the different Dagster core concepts: | ||
|
||
- Assets | ||
- Schedules | ||
- Sensors | ||
- Partitions | ||
|
||
The final project structure should look like this: | ||
``` | ||
dagster-etl-tutorial/ | ||
├── data/ | ||
│   ├── products.csv | ||
│   ├── sales_data.csv | ||
│   ├── sales_reps.csv | ||
│   └── sample_request/ | ||
│       └── request.json | ||
├── etl_tutorial/ | ||
│   ├── assets.py | ||
│   ├── definitions.py | ||
│   ├── partitions.py | ||
│   ├── schedules.py | ||
│   └── sensors.py | ||
├── pyproject.toml | ||
├── setup.cfg | ||
└── setup.py | ||
``` | ||
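
If you are restructuring by hand, a short script can create the empty module files before you move code into them. The following is a minimal sketch, assuming the `etl_tutorial/` package name shown in the tree above. The `scaffold_modules` helper is a hypothetical name and not part of Dagster, and it never overwrites a file that already exists.

```python
from pathlib import Path

# Module files taken from the target layout shown above.
MODULE_FILES = [
    "assets.py",
    "definitions.py",
    "partitions.py",
    "schedules.py",
    "sensors.py",
]


def scaffold_modules(project_root: Path, package: str = "etl_tutorial") -> list[Path]:
    """Create any missing module files and return the paths that were created.

    Hypothetical helper for illustration; existing files are left untouched.
    """
    package_dir = project_root / package
    package_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for name in MODULE_FILES:
        path = package_dir / name
        if not path.exists():
            path.touch()  # create an empty file to move code into later
            created.append(path)
    return created
```

Calling `scaffold_modules(Path("."))` from the project root would create only the files that are missing, so it is safe to run again after some of the code has already been moved.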
|
||
### Assets | ||
|
||
Assets make up the majority of our project, so this will be our largest file. | ||
|
||
<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/assets.py" language="python" lineStart="1" lineEnd="292"/> | ||
|
||
### Schedules | ||
|
||
The schedules file will only contain the `weekly_update_schedule`. | ||
|
||
<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/schedules.py" language="python" lineStart="1" lineEnd="8"/> | ||
|
||
### Sensors | ||
|
||
The sensors file will have the job and sensor for the `adhoc_request` asset. | ||
|
||
<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/sensors.py" language="python" lineStart="1" lineEnd="47"/> | ||
|
||
## Adjusting definitions object | ||
|
||
Now that the project is split across separate files, we need to adjust how the different elements are added to the definitions object. | ||
|
||
1. Imports | ||
|
||
The Dagster project runs from the root directory, so any file references need to use the root directory as their starting point. | ||
|
||
Additionally, Dagster provides functions to load everything of a given kind from a module: `load_assets_from_modules` for assets and `load_asset_checks_from_modules` for asset checks. | ||
|
||
2. Definitions | ||
|
||
To bring the project together, copy the following code into your `definitions.py` file: | ||
|
||
<CodeExample filePath="guides/tutorials/etl_tutorial_completed/etl_tutorial/definitions.py" language="python" lineStart="1" lineEnd="19"/> | ||
|
||
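
To make the idea behind the module loaders mentioned in the imports step more concrete, here is a conceptual sketch that uses only the standard library. It is not Dagster's implementation: `FakeAsset`, `collect_from_modules`, and the in-memory `assets` module are hypothetical stand-ins. It shows only the general pattern of scanning a module's top-level names and collecting every object of a given type.

```python
import types
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAsset:
    """Hypothetical stand-in for an asset definition (not a Dagster class)."""

    name: str


def collect_from_modules(modules, kind):
    """Return every top-level object of type `kind` found in `modules`."""
    found = []
    for module in modules:
        for value in vars(module).values():
            if isinstance(value, kind):
                found.append(value)
    return found


# Build a throwaway module in memory to stand in for etl_tutorial/assets.py.
assets = types.ModuleType("assets")
assets.products = FakeAsset("products")
assets.sales_data = FakeAsset("sales_data")
assets.helper = "not an asset"  # skipped because it is not a FakeAsset

loaded = collect_from_modules([assets], FakeAsset)
print([asset.name for asset in loaded])
```

With this pattern, adding a new object of the right type to the module is enough for it to be picked up, without editing the file that assembles the definitions.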
## Quick validation | ||
|
||
To check that your definitions file loads correctly, run `dagster definitions validate` from the same directory where you would run `dagster dev`. This command is useful in CI/CD pipelines because it verifies that the project loads without starting the webserver. | ||
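
In a CI job, the validation step only needs the command's exit status. The sketch below wraps the command in a small Python helper. `run_check` and `main` are hypothetical names, and the sketch assumes the command exits with a non-zero status when validation fails, which is the usual convention for command-line tools used in CI.

```python
import subprocess
import sys

# Command from the paragraph above; run it from the project root.
VALIDATE_CMD = ["dagster", "definitions", "validate"]


def run_check(cmd, cwd=None):
    """Run `cmd` and return True when it exits with status 0."""
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the tool's own error output in the CI log.
        print(result.stderr, file=sys.stderr)
    return result.returncode == 0


def main() -> int:
    """Return 0 if the definitions validate, 1 otherwise."""
    return 0 if run_check(VALIDATE_CMD) else 1
```

A CI script could end with `sys.exit(main())` so that the pipeline step fails whenever the definitions do not load.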
|
||
## That's it! | ||
Check failure on line 79 in docs/docs-beta/docs/tutorial/etl-tutorial/08-refactoring-the-project.md GitHub Actions / runner / vale
|
||
|
||
Congratulations! You have completed your first project with Dagster and have an example of how to use the building blocks to build your own data pipelines. | ||
|
||
## Recommended next steps | ||
|
||
- Join our [Slack community](https://dagster.io/slack). | ||
- Continue learning with [Dagster University](https://courses.dagster.io/) courses. | ||
- Start a [free trial of Dagster+](https://dagster.cloud/signup) for your own project. |