
Commit: page link fixes
C00ldudeNoonan committed Dec 27, 2024
1 parent aae8195 commit 1eb255c
Showing 9 changed files with 19 additions and 25 deletions.
2 changes: 1 addition & 1 deletion docs/docs-beta/docs/getting-started/quickstart.md
@@ -153,5 +153,5 @@ id,name,age,city,age_group

Congratulations! You've just built and run your first pipeline with Dagster. Next, you can:

- Continue with the [ETL pipeline tutorial](/tutorial/tutorial-etl) to learn how to build a more complex ETL pipeline
- Continue with the [ETL pipeline tutorial](/tutorial/etl-tutorial/etl-tutorial-introduction) to learn how to build a more complex ETL pipeline
- Learn how to [Think in assets](/guides/build/assets-concepts/index.md)
(next changed file)

@@ -130,9 +130,7 @@ To make sure Dagster and its dependencies were installed correctly, navigate to

followed by a bash code snippet for `dagster dev`

[screenshot of ui]


## Next steps

- Continue this tutorial by [creating and materializing assets](/tutorial/02-create-and-materialize-assets)
- Continue this tutorial by [creating and materializing assets](/tutorial/etl-tutorial/02-create-and-materialize-assets)
(next changed file)

@@ -35,7 +35,7 @@ Open the `definitions.py` file in the `etl_tutorial` directory and copy the foll

## 2. Define the DuckDB resource

In Dagster, [Resources](/api/resources) are the external services, tools, and storage backends you need to do your job. For the storage backend in this project, we'll use [DuckDB](https://duckdb.org/), a fast, in-process SQL database that runs inside your application. We'll define it once in the definitions object, making it available to all assets and objects that need it.
In Dagster, [Resources API docs](/todo) are the external services, tools, and storage backends you need to do your job. For the storage backend in this project, we'll use [DuckDB](https://duckdb.org/), a fast, in-process SQL database that runs inside your application. We'll define it once in the definitions object, making it available to all assets and objects that need it.

```python
defs = dg.Definitions(
    ...
```

@@ -46,7 +46,7 @@ In Dagster, [Resources](/api/resources) are the external services, tools, and st
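
The rest of the snippet is collapsed in this diff; as an illustrative sketch (the resource key and database path below are assumptions, not the tutorial's exact values), attaching a DuckDB resource to the `Definitions` object might look like:

```python
import dagster as dg
from dagster_duckdb import DuckDBResource

defs = dg.Definitions(
    assets=[],  # assets are added in the next step
    resources={
        # the resource key "duckdb" and the database path are illustrative
        "duckdb": DuckDBResource(database="data/mydb.duckdb"),
    },
)
```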

## 3. Create assets

Software defined [assets](/api/assets) are the main building blocks in Dagster. An asset is composed of three components:
Software defined [assets API docs](/todo) are the main building blocks in Dagster. An asset is composed of three components:
1. Asset key or unique identifier.
2. An op which is a function that is invoked to produce the asset.
3. Upstream dependencies that the asset depends on.
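
As a hedged illustration of those three components (the asset name, table, and CSV path below are placeholders rather than the tutorial's exact code):

```python
import dagster as dg
from dagster_duckdb import DuckDBResource

@dg.asset  # 1. the asset key defaults to the function name
def raw_products(duckdb: DuckDBResource) -> None:
    # 2. the function body is the op that produces the asset
    with duckdb.get_connection() as conn:
        conn.execute(
            "CREATE OR REPLACE TABLE raw_products AS "
            "SELECT * FROM read_csv_auto('data/products.csv')"
        )
# 3. downstream assets would declare this asset in their deps
```
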
@@ -105,4 +105,4 @@ To materialize your assets:

## Next steps

- Continue this tutorial with your with your [asset dependencies](/tutorial/03-creating-a-downstream-asset)
- Continue this tutorial with your with your [asset dependencies](/tutorial/etl-tutorial/03-creating-a-downstream-asset)
(next changed file)

@@ -7,7 +7,7 @@ last_update:

# Asset Dependencies

Now that we have the raw data loaded into DuckDB, we need to create a [downstream asset](guides/build/asset-concepts/asset-dependencies) that combines the upstream assets together. In this step, you will:
Now that we have the raw data loaded into DuckDB, we need to create a [downstream asset](/guides/build/assets-concepts/asset-dependencies) that combines the upstream assets together. In this step, you will:

- Create a downstream asset
- Materialize that asset
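
A minimal sketch of the idea (asset and table names are illustrative): the downstream asset lists its upstream assets in `deps`, so Dagster materializes them in order.

```python
import dagster as dg
from dagster_duckdb import DuckDBResource

@dg.asset(deps=["raw_products", "raw_sales"])  # illustrative upstream asset keys
def joined_data(duckdb: DuckDBResource) -> None:
    # materializes only after both upstream tables have been produced
    with duckdb.get_connection() as conn:
        conn.execute(
            "CREATE OR REPLACE TABLE joined_data AS "
            "SELECT * FROM raw_sales JOIN raw_products USING (product_id)"
        )
```
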
@@ -43,4 +43,4 @@ Your Definitions object should now look like this:

## Next steps

- Continue this tutorial with [create and materialize a partitioned asset](/tutorial/05-ensuring-data-quality-with-asset-checks)
- Continue this tutorial with [create and materialize a partitioned asset](/tutorial/etl-tutorial/05-ensuring-data-quality-with-asset-checks)
(next changed file)

@@ -9,7 +9,7 @@ last_update:

Data Quality is critical in data pipelines. Much like in a factory producing cars, inspecting parts after they complete certain steps ensures that defects are caught before the car is completely assembled.

In Dagster, you define [asset checks](guides/build/test/asset-checks) in a similar way that you would define an Asset. In this step you will:
In Dagster, you define [asset checks](/guides/test/asset-checks) in a similar way that you would define an Asset. In this step you will:

- Define an asset check
- Execute that asset check in the UI
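
As an illustrative sketch (the check name, target asset, and SQL are assumptions, not the tutorial's exact check), an asset check is a decorated function that returns an `AssetCheckResult`:

```python
import dagster as dg
from dagster_duckdb import DuckDBResource

@dg.asset_check(asset="joined_data")  # illustrative target asset
def no_null_product_ids(duckdb: DuckDBResource) -> dg.AssetCheckResult:
    # fail the check if any joined rows are missing a product_id
    with duckdb.get_connection() as conn:
        nulls = conn.execute(
            "SELECT COUNT(*) FROM joined_data WHERE product_id IS NULL"
        ).fetchone()[0]
    return dg.AssetCheckResult(passed=nulls == 0, metadata={"null_rows": nulls})
```
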
@@ -50,4 +50,4 @@ Asset checks will run when an asset is materialized, but asset checks can also b

## Next steps

- Continue this tutorial with [Asset Checks](/tutorial/04-ensuring-data-quality-with-asset-checks)
- Continue this tutorial with [Asset Checks](/tutorial/etl-tutorial/04-ensuring-data-quality-with-asset-checks)
(next changed file: docs/docs-beta/docs/tutorial/etl-tutorial/05-create-and-materialize-partitioned-asset.md)

@@ -8,7 +8,7 @@ last_update:

# Partitions

[Partitions](guides/partitioning) are a core abstraction in Dagster, they are how you manage large datasets, process incremental updates, and improve pipeline performance. In Dagster you can partition assets the following ways:
[Partitions](/guides/create-a-pipeline/partitioning) are a core abstraction in Dagster, they are how you manage large datasets, process incremental updates, and improve pipeline performance. In Dagster you can partition assets the following ways:

1. Time-based: Split data by time periods (e.g., daily, monthly)

[vale warning, line 13: [Dagster.latin] Use 'for example' instead of 'e.g.', but consider rewriting the sentence.]
2. Category-based: Divide by known categories (e.g., country, product type)

[vale warning, line 14: [Dagster.latin] Use 'for example' instead of 'e.g.', but consider rewriting the sentence.]
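
For illustration (the start date and category values below are placeholders), those two styles map to partition definitions like these:

```python
import dagster as dg

# Time-based: one partition per month, from an illustrative start date
monthly_partitions = dg.MonthlyPartitionsDefinition(start_date="2024-01-01")

# Category-based: one partition per known category (illustrative values)
category_partitions = dg.StaticPartitionsDefinition(["electronics", "apparel", "home"])

@dg.asset(partitions_def=monthly_partitions)
def monthly_sales(context: dg.AssetExecutionContext) -> None:
    # context.partition_key is the month being processed, e.g. "2024-01-01"
    context.log.info(f"Processing partition {context.partition_key}")
```
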
@@ -177,4 +177,4 @@ To materialize these assets :

## Next Steps

Now that we have the main assets in our ETL pipeline, its time to add [automation to our pipeline](tutorial/06-automating-your-pipeline)
Now that we have the main assets in our ETL pipeline, its time to add [automation to our pipeline](/tutorial/etl-tutorial/06-automating-your-pipeline)
(next changed file: docs/docs-beta/docs/tutorial/etl-tutorial/06-automating-your-pipeline.md)

@@ -7,7 +7,7 @@ last_update:

# Automation

There are several ways to automate pipelines and assets [in Dagster](guides/automation).
There are several ways to automate pipelines and assets [in Dagster](/guides/automate).

In this step you will:

@@ -16,23 +16,23 @@ In this step you will:

## 1. Automating asset materialization

Ideally, the reporting assets created in the last step should refresh whenever the upstream data is updated. This can be done simply using [declarative automation](guides/declarative-automation) and adding an automation condition to the asset definition.
Ideally, the reporting assets created in the last step should refresh whenever the upstream data is updated. This can be done simply using [declarative automation](/guides/automate/declarative-automation) and adding an automation condition to the asset definition.

[vale error, line 19: [Vale.Avoid] Avoid using 'simply'.]

Update the `monthly_sales_performance` asset to have the automation condition in the decorator:

<CodeExample filePath="guides/tutorials/etl_tutorial/etl_tutorial/definitions.py" language="python" lineStart="155" lineEnd="209"/>
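
The referenced file isn't shown in this diff; as a minimal sketch (the asset name and body are placeholders), an eager automation condition on an asset looks roughly like this:

```python
import dagster as dg

@dg.asset(
    deps=["joined_data"],  # illustrative upstream dependency
    automation_condition=dg.AutomationCondition.eager(),
)
def monthly_sales_performance() -> None:
    # re-materializes automatically whenever its upstream dependencies update
    ...
```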

Do the same thing for `product_performance`:

<CodeExample filePath="guides/tutorials/etl_tutorial/etl_tutorial/definitions.py" language="python" lineStart="217" lineEnd="267"/>
<CodeExample filePath="guides/tutorials/etl_tutorial/etl_tutorial/definitions.py" language="python" lineStart="216" lineEnd="267"/>

## 2. Scheduled Jobs

Cron-based schedules are common in data orchestration. For our pipeline, assume updated CSVs get dropped into a file location every week at a specified time by an external process. We want a job that runs the pipeline and materializes the assets. Since we already defined the performance assets to materialize using the eager condition, the entire pipeline will refresh whenever the upstream data is updated.

Copy the following code underneath the `product performance` asset:

<CodeExample filePath="guides/tutorials/etl_tutorial/etl_tutorial/definitions.py" language="python" lineStart="267" lineEnd="273"/>
<CodeExample filePath="guides/tutorials/etl_tutorial/etl_tutorial/definitions.py" language="python" lineStart="268" lineEnd="273"/>
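
The referenced snippet isn't visible here either; a hedged sketch of a weekly schedule over an asset job (the job name, selection, and cron string are illustrative):

```python
import dagster as dg

# Select the ingestion assets into a job (selection is illustrative)
weekly_update_job = dg.define_asset_job(
    name="weekly_update_job",
    selection=["raw_products", "raw_sales"],
)

# Run every Monday at 00:00; downstream eager assets refresh on their own
weekly_schedule = dg.ScheduleDefinition(
    job=weekly_update_job,
    cron_schedule="0 0 * * 1",
)
```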

## 3. Running the entire pipeline

@@ -54,4 +54,4 @@ Additionally if you navigate to the runs tab you will see that materializations

## Next steps

- Continue this tutorial with adding a [sensor based asset](/tutorial/07-creating-a-sensor-asset)
- Continue this tutorial with adding a [sensor based asset](/tutorial/etl-tutorial/07-creating-a-sensor-asset)
(next changed file)

@@ -5,9 +5,7 @@ last_update:
author: Alex Noonan
---

# Sensors

[Sensors](guides/sensors) in Dagster are a powerful tool for automating workflows based on external events or conditions. They allow you to trigger jobs when specific criteria are met, making them essential for event-driven automation.
[Sensors](/guides/automate/sensors) in Dagster are a powerful tool for automating workflows based on external events or conditions. They allow you to trigger jobs when specific criteria are met, making them essential for event-driven automation.

Event-driven automations support situations where jobs occur at irregular cadences or in rapid succession; sensors are the building block in Dagster you can use to support this.

@@ -23,7 +21,7 @@ In this step you will:

## 1. Event Driven Asset

For our pipeline, we want to model a situation where an executive wants a pivot table report of sales results by department and product. They want that processed in real time from their request and it isnt a high priority to build the reporting to have this available and refreshing.
For our pipeline, we want to model a situation where an executive wants a pivot table report of sales results by department and product. They want that processed in real time from their request and it isn't a high priority to build the reporting to have this available and refreshing.

For this asset we need to define the structure of the request that it is expecting in the materialization context.
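
The tutorial's exact code isn't included in this diff; as an illustrative sketch (the config class, file path, and job name are assumptions), the request structure can be modeled as run configuration and picked up by a sensor:

```python
import json
import os

import dagster as dg

class AdhocRequestConfig(dg.Config):
    department: str
    product: str

@dg.asset
def adhoc_request(config: AdhocRequestConfig) -> None:
    # builds the pivot-table report for the requested department/product
    ...

adhoc_request_job = dg.define_asset_job(name="adhoc_request_job", selection=["adhoc_request"])

@dg.sensor(job=adhoc_request_job)
def adhoc_request_sensor():
    # illustrative trigger: a JSON request file dropped into a local directory
    request_path = "data/requests/request.json"
    if not os.path.exists(request_path):
        yield dg.SkipReason("No pending requests")
        return
    with open(request_path) as f:
        request = json.load(f)
    yield dg.RunRequest(
        run_key=request_path,
        run_config={"ops": {"adhoc_request": {"config": request}}},
    )
```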

@@ -70,4 +68,4 @@ sensors include the following elements:

Now that we have our complete project, the next step is to refactor the project into a more manageable structure so we can add to it as needed.

Finish the tutorial with [refactoring the project](tutorial/refactoring-the-project)
Finish the tutorial with [refactoring the project](/tutorial/etl-tutorial/08-refactoring-the-project)
(next changed file)

@@ -5,8 +5,6 @@ last_update:
author: Alex Noonan
---

# Refactoring code

Many engineers leave something alone once it's working as expected. But the first time you do something is rarely the best implementation of a use case, and all projects benefit from incremental improvements.

## Splitting up project structure