[dagster-u] - Fix typos and some code examples #18539

Merged
merged 3 commits into from
Dec 6, 2023
Original file line number Diff line number Diff line change
@@ -33,7 +33,7 @@ def taxi_trips_file():

Using the `month_to_fetch` variable, the URL to retrieve the file from becomes: `https://.../trip-data/yellow_tripdata_2023-03.parquet`

- 4. Next, the path of the file will be stored at is constructed. The value of `TAXI_TRIPS_TEMPLATE_FILE_PATH`, stored in your project’s `assets/constants.py` file, is retrieved: `data/raw/taxi_trips_{}.parquet`
+ 4. Next, the path where the file will be stored is constructed. The value of `TAXI_TRIPS_TEMPLATE_FILE_PATH`, stored in your project’s `assets/constants.py` file, is retrieved: `data/raw/taxi_trips_{}.parquet`

5. The parquet file is created and saved at `data/raw/taxi_trips_2023-03.parquet`
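
For reference, steps 4 and 5 in this hunk can be sketched as below. This is a minimal illustration, not the lesson's actual asset code: it assumes the `{}` placeholder is filled with `str.format`, and the variable names `remote_file_name` and `local_path` are made up for the example. The URL prefix is elided in the lesson text, so only the file name portion is built here.

```python
# Template value quoted in the lesson for TAXI_TRIPS_TEMPLATE_FILE_PATH.
TAXI_TRIPS_TEMPLATE_FILE_PATH = "data/raw/taxi_trips_{}.parquet"

# Example month used in the lesson text.
month_to_fetch = "2023-03"

# File name portion of the download URL (hypothetical variable name).
remote_file_name = f"yellow_tripdata_{month_to_fetch}.parquet"

# Assumption: the template's `{}` placeholder is filled via str.format.
local_path = TAXI_TRIPS_TEMPLATE_FILE_PATH.format(month_to_fetch)

print(remote_file_name)
print(local_path)
```

Under these assumptions, `local_path` matches the location named in step 5.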

@@ -30,6 +30,6 @@ def taxi_zones_file():
"https://data.cityofnewyork.us/api/views/755u-8jsi/rows.csv?accessType=DOWNLOAD"
)

- with open("data/raw/taxi_zones.csv", "wb") as output_file:
+ with open(constants.TAXI_ZONES_FILE_PATH, "wb") as output_file:
      output_file.write(raw_taxi_zones.content)
```
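
The point of this change is that the output path is defined once in `constants` rather than repeated as a string literal. A minimal sketch of that pattern follows, under loud assumptions: `constants` here is a stand-in namespace rather than the project's `assets/constants.py`, the path points at a temporary directory, `save_taxi_zones` is a hypothetical helper, and placeholder bytes replace the real HTTP response.

```python
import os
import tempfile
from types import SimpleNamespace

# Stand-in for the project's constants module (assumption for this sketch);
# a temporary directory is used so the example does not touch data/raw.
base_dir = tempfile.mkdtemp()
constants = SimpleNamespace(
    TAXI_ZONES_FILE_PATH=os.path.join(base_dir, "taxi_zones.csv")
)


def save_taxi_zones(content: bytes) -> str:
    """Write raw bytes to the shared path (hypothetical helper)."""
    with open(constants.TAXI_ZONES_FILE_PATH, "wb") as output_file:
        output_file.write(content)
    return constants.TAXI_ZONES_FILE_PATH


# Placeholder bytes standing in for `raw_taxi_zones.content`.
placeholder_content = b"placeholder,csv\n"
saved_path = save_taxi_zones(placeholder_content)
```

Any other asset that reads the file can refer to the same constant, so a change to the location is made in one place.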
@@ -56,8 +56,7 @@ Update the imports in `assets/metrics.py` to the following:

```python {% obfuscated="true" %}
import requests
- from dagster_duckdb
- import DuckDBResource
+ from dagster_duckdb import DuckDBResource
from . import constants
from dagster import asset
```
@@ -8,7 +8,7 @@ lesson: '6'

In previous lessons, you learned about assets, how to connect assets to represent a data pipeline, and how to start a run that materializes the assets.

- Dagster’s role is to be the single pane of glass across all data pipelines in an organization. To do make this possible, Dagster needs to know about the services and systems used in your data pipelines, such as cloud storage or a data warehouse. In this lesson, we’ll show you how to accomplish this using software engineering best practices.
+ Dagster’s role is to be the single pane of glass across all data pipelines in an organization. To make this possible, Dagster needs to know about the services and systems used in your data pipelines, such as cloud storage or a data warehouse. In this lesson, we’ll show you how to accomplish this using software engineering best practices.

With this in mind, the best practice we’ll focus on in this lesson is called **Don’t Repeat Yourself**, or **DRY** for short. This principle recommends that engineers do something once and only once, thereby reducing duplication and redundancy. By being intentional and writing DRY code, you can reduce the number of bugs, increase the ability to understand the project’s codebase, and improve observability over how logic and functionality are used.
