[docs] [revamp] - Seed Concept pages #23602
Merged
8419e94 Add Concepts (erinkcochran87)
79d978a Remove content to fix weird build error (erinkcochran87)
7f93619 Finish Concepts (erinkcochran87)
3aa28aa update sidebar positions (PedramNavid)
05a2acb revert quick-start (PedramNavid)
8fb8d5f try updating workflow (PedramNavid)
cf504f9 fix concurrency bug (PedramNavid)
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Asset checks" | ||
sidebar_position: 7 | ||
--- | ||
|
||
# Asset checks |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Asset dependencies" | ||
sidebar_position: 3 | ||
--- | ||
|
||
# Asset dependencies |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Asset materialization" | ||
sidebar_position: 2 | ||
--- | ||
|
||
# Asset materialization |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Asset metadata" | ||
sidebar_position: 4 | ||
--- | ||
|
||
# Asset metadata |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Thinking in assets" | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Thinking in assets |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Automation |
6 changes: 6 additions & 0 deletions
docs/docs-next/docs/concepts/automation/declarative-automation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Declarative Automation" | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Declarative Automation |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Schedules" | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Schedules |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Sensors" | ||
sidebar_position: 2 | ||
--- | ||
|
||
# Sensors |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Execution |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Dagster daemon" | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Dagster daemon |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Run coordinators" | ||
sidebar_position: 4 | ||
--- | ||
|
||
# Run coordinators |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Run executors" | ||
sidebar_position: 3 | ||
--- | ||
|
||
# Run executors |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Run launchers" | ||
sidebar_position: 2 | ||
--- | ||
|
||
# Run launchers |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
--- | ||
title: "I/O managers" | ||
--- | ||
|
||
# I/O managers |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Ops and jobs |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Job configuration" | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Job configuration |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: "Ops vs. assets" | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Ops vs. assets |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
--- | ||
title: "Partitions" | ||
--- | ||
|
||
# Partitions |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Resources |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,171 +1,3 @@ | ||
--- | ||
title: Quickstart | ||
description: Learn how to quickly get up and running with Dagster | ||
last_update: | ||
date: 2024-08-10 | ||
author: Pedram Navid | ||
--- | ||
|
||
# Dagster Tutorial: Building Your First Dagster Project | ||
|
||
Welcome to this hands-on tutorial where you'll learn how to build a basic Extract, Transform, Load (ETL) pipeline using Dagster. By the end of this tutorial, you'll have created a functional pipeline that extracts data from a CSV file and transforms it. | ||
|
||
## What You'll Learn | ||
|
||
- How to set up a basic Dagster project | ||
- How to create Software-Defined Assets (SDAs) for each step of the ETL process | ||
- How to use Dagster's built-in features to monitor and execute your pipeline | ||
|
||
## Prerequisites | ||
|
||
- Basic Python knowledge | ||
- Python 3.7+ installed on your system, see [installation guide](tutorial/installation.md) for more details | ||
|
||
## Step 1: Set Up Your Dagster Environment | ||
|
||
First, set up a new Dagster project. | ||
|
||
1. Open your terminal and create a new directory for your project: | ||
|
||
```bash title="Create a new directory" | ||
mkdir dagster-quickstart | ||
cd dagster-quickstart | ||
``` | ||
|
||
2. Create a virtual environment and activate it: | ||
|
||
```bash title="Create a virtual environment" | ||
python -m venv venv | ||
source venv/bin/activate | ||
# On Windows, use `venv\Scripts\activate` | ||
``` | ||
|
||
3. Install Dagster and the required dependencies: | ||
|
||
```bash title="Install Dagster and dependencies" | ||
pip install dagster dagster-webserver pandas | ||
``` | ||
|
||
## Step 2: Create Your Dagster Project Structure | ||
|
||
Set up a basic project structure: | ||
|
||
:::warning | ||
|
||
The file structure here is simplified to get you started quickly. | ||
|
||
Once you've completed this tutorial, consider the [ETL Pipeline Tutorial](/tutorial/tutorial-etl) to learn | ||
how to build more complex pipelines with best practices. | ||
|
||
::: | ||
|
||
1. Create the following files and directories: | ||
|
||
```bash title="Project structure" | ||
dagster-quickstart/ | ||
├── quickstart/ | ||
│ ├── __init__.py | ||
│ └── assets.py | ||
└── data/ | ||
    └── sample_data.csv | ||
``` | ||
|
||
```bash title="Create the project structure" | ||
mkdir quickstart data | ||
touch quickstart/__init__.py quickstart/assets.py | ||
touch data/sample_data.csv | ||
``` | ||
|
||
|
||
|
||
2. Create a sample CSV file as a data source. In the `data/sample_data.csv` file, add the following content: | ||
|
||
```csv | ||
id,name,age,city | ||
1,Alice,28,New York | ||
2,Bob,35,San Francisco | ||
3,Charlie,42,Chicago | ||
4,Diana,31,Los Angeles | ||
``` | ||
|
||
## Step 3: Define Your Assets | ||
|
||
Now, create the assets for the ETL pipeline. Open `quickstart/assets.py` and add the following code: | ||
|
||
```python | ||
import pandas as pd | ||
from dagster import asset, Definitions | ||
|
||
@asset | ||
def processed_data(): | ||
    df = pd.read_csv("data/sample_data.csv") | ||
    df['age_group'] = pd.cut(df['age'], bins=[0, 30, 40, 100], labels=['Young', 'Middle', 'Senior']) | ||
    df.to_csv("data/processed_data.csv", index=False) | ||
    return "Data loaded successfully" | ||
|
||
defs = Definitions(assets=[processed_data]) | ||
``` | ||
|
||
This code defines a single data asset within a single computation that performs three steps: | ||
- Reads data from the CSV file | ||
- Adds an `age_group` column based on the `age` | ||
- Saves the processed data to a CSV file | ||
|
||
If you are used to task-based orchestrations, this might feel a bit different. | ||
In traditional task-based orchestrations, you would have three separate steps, | ||
but in Dagster, you model your pipelines using assets as the fundamental building block, | ||
rather than tasks. | ||
|
||
The `Definitions` object serves as the central configuration point for a Dagster project. In this code, a `Definitions` | ||
object is defined and the asset is passed to it. This tells Dagster about the assets that make up the ETL pipeline | ||
and allows Dagster to manage their execution and dependencies. | ||
|
||
## Step 4: Run Your Pipeline | ||
|
||
:::warning | ||
|
||
There should be screenshots here!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | ||
|
||
::: | ||
|
||
1. In the terminal, navigate to your project root directory and run: | ||
|
||
```bash | ||
dagster dev -f quickstart/assets.py | ||
``` | ||
|
||
2. Open your web browser and go to `http://localhost:3000` | ||
|
||
3. You should see the Dagster UI along with the asset. | ||
|
||
4. Click Materialize All to run the pipeline. | ||
|
||
5. In the popup that appears, click View to view a run as it executes. | ||
|
||
6. Watch as Dagster executes your pipeline. Try different views by selecting the different view buttons in the top-left. | ||
You can click on each asset to see its logs and metadata. | ||
|
||
## Step 5: Verify Your Results | ||
|
||
To verify that your pipeline worked correctly: | ||
|
||
1. In your terminal, run: | ||
|
||
```bash | ||
cat data/processed_data.csv | ||
``` | ||
|
||
You should see your transformed data, including the new `age_group` column. | ||
|
||
## What You've Learned | ||
|
||
Congratulations! You've just built and run your first pipeline with Dagster. You've learned how to: | ||
|
||
- Set up a Dagster project | ||
- Define Software-Defined Assets for each step of your pipeline | ||
- Use Dagster's UI to run and monitor your pipeline | ||
|
||
## Next Steps | ||
|
||
- Continue with the [ETL Pipeline Tutorial](/tutorial/tutorial-etl) to learn how to build a more complex ETL pipeline | ||
- Learn how to [Think in Assets](/concepts/thinking-in-assets) | ||
# Quickstart | ||
did you mean to delete this one
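
For reference, the `pd.cut` call in the removed quickstart bins ages with pandas' default right-closed intervals, so the groups are (0, 30], (30, 40] and (40, 100]. The sketch below reproduces that rule in plain Python so the boundary behavior is easy to check; the `age_group` helper and the `BINS` and `LABELS` names are hypothetical and are not part of the quickstart or of Dagster.

```python
from bisect import bisect_left

# Bin edges and labels copied from the quickstart's pd.cut call.
BINS = [0, 30, 40, 100]
LABELS = ["Young", "Middle", "Senior"]


def age_group(age):
    """Return the label for `age`, or None if it falls outside (0, 100].

    Mirrors pd.cut's defaults: each bin excludes its left edge and
    includes its right edge, so 30 is "Young" and 40 is "Middle".
    """
    if not BINS[0] < age <= BINS[-1]:
        return None  # pd.cut would produce NaN for these values
    return LABELS[bisect_left(BINS, age) - 1]


# Ages from the sample CSV: Alice, Bob, Charlie, Diana.
for name, age in [("Alice", 28), ("Bob", 35), ("Charlie", 42), ("Diana", 31)]:
    print(name, age_group(age))
```

With the sample data, Alice lands in Young, Bob and Diana in Middle, and Charlie in Senior.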
I usually do these in increments of 10. That way, when you decide you want to add something between the 7th and 8th items (positions 70 and 80), you don't have to renumber everything above and below; you just add one at position 71.
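
A minimal sketch of the numbering scheme this comment describes, assuming pages are ordered by an integer `sidebar_position` as in the front matter in this PR. The helper names `assign_positions` and `position_after` are hypothetical, made up for illustration; the page titles come from the asset concept pages added here.

```python
def assign_positions(titles, step=10):
    """Map each page title to a sidebar position: step, 2 * step, ..."""
    return {title: (i + 1) * step for i, title in enumerate(titles)}


def position_after(lower, upper):
    """Return the first free integer after `lower` and before `upper`."""
    if upper - lower < 2:
        raise ValueError("no gap left between positions; renumber first")
    return lower + 1


# Page titles taken from the asset concept pages in this PR.
pages = [
    "Thinking in assets",
    "Asset materialization",
    "Asset dependencies",
    "Asset metadata",
    "Asset checks",
]
positions = assign_positions(pages)
print(positions)

# A new page between positions 70 and 80 needs no renumbering.
print(position_after(70, 80))
```

With consecutive positions such as 7 and 8, `position_after` raises `ValueError`, which is the renumbering problem the comment is trying to avoid. Picking a midpoint such as 75 instead of 71 would also work and leaves room on both sides.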