From e5abc33f101b88ff28b2019472a4a3205a9b8267 Mon Sep 17 00:00:00 2001
From: Erin Cochran
Date: Wed, 18 Oct 2023 11:54:49 -0400
Subject: [PATCH 1/2] Fix file name formats

---
 .../lesson-2/project-files.md | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/dagster-university/pages/dagster-essentials/lesson-2/project-files.md b/docs/dagster-university/pages/dagster-essentials/lesson-2/project-files.md
index c691bf9d58555..763d2f1088c24 100644
--- a/docs/dagster-university/pages/dagster-essentials/lesson-2/project-files.md
+++ b/docs/dagster-university/pages/dagster-essentials/lesson-2/project-files.md
@@ -56,13 +56,13 @@ The columns in the following table are as follows:
 
 ---
 
-- **README.md**
+- `README.md`
 - Python
 - A description and starter guide for the Dagster project.
 
 ---
 
-- **dagster_university/**
+- `dagster_university/`
 - Dagster
 - A Python module that will contain your Dagster code. This directory also contains the following:
   - `__init__.py` - This file includes a `Definitions` object that defines that is loaded in your project, such as assets and sensors. This allows Dagster to load the definitions in a module. We’ll discuss this topic, and this file, later in this course.
@@ -70,49 +70,49 @@ The columns in the following table are as follows:
 
 ---
 
-- **dagster_university/**init**.py**
+- `dagster_university/__init__.py`
 - Dagster
 - Each Python module has an `__init__.py`. This root-level `__init__.py` is specifically used to import and combine the different aspects of your Dagster project. This is called defining your Code Location. You’ll learn more about this in a future lesson.
 
 ---
 
-- **dagster_university/assets/constants.py**
+- `dagster_university/assets/constants.py`
 - Dagster U
 - A pre-made file with some string constants that you’ll reference for convenience.
 
 ---
 
-- **dagster_university_tests/**
+- `dagster_university_tests/`
 - Dagster
 - A Python module that contains unit tests for `dagster_university`
 
 ---
 
-- **data/**
+- `data/`
 - Dagster U
 - This directory (and directories within it) is where you’ll store the data assets you’ll make during this course. In production settings, this could be Amazon S3 or a data warehouse.
 
 ---
 
-- **.env**
+- `.env`
 - Python
 - A text file containing pre-configured environment variables. We’ll talk more about this file in Lesson 6, when we cover connecting to external services.
 
 ---
 
-- **pyproject.toml**
+- `pyproject.toml`
 - Python
 - A file that specifies package core metadata in a static, tool-agnostic way. This file includes a `tool.dagster` section which references the Python module with your Dagster definitions defined and discoverable at the top level. This allows you to use the `dagster dev` command to load your Dagster code without any parameters.
 
 ---
 
-- **setup.py**
+- `setup.py`
 - Python
 - A build script with Python package dependencies for your new project as a package. This file is used to specify dependencies.
 
 ---
 
-- **setup.cfg**
+- `setup.cfg`
 - Python
 - A file that contains option defaults for `setup.py` commands.
 
From 3f4405a59940d061379ee4ba9e1e37d60ba48e21 Mon Sep 17 00:00:00 2001
From: Erin Cochran
Date: Wed, 18 Oct 2023 12:28:39 -0400
Subject: [PATCH 2/2] Put drop table in callout

---
 .../lesson-8/coding-practice-partition-taxi-trips.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/dagster-university/pages/dagster-essentials/lesson-8/coding-practice-partition-taxi-trips.md b/docs/dagster-university/pages/dagster-essentials/lesson-8/coding-practice-partition-taxi-trips.md
index cfa5d66bf69b9..909c58db1eee3 100644
--- a/docs/dagster-university/pages/dagster-essentials/lesson-8/coding-practice-partition-taxi-trips.md
+++ b/docs/dagster-university/pages/dagster-essentials/lesson-8/coding-practice-partition-taxi-trips.md
@@ -12,19 +12,23 @@ To practice what you’ve learned, partition the `taxi_trips` asset by month usi
 
 - With every partition, insert the new data into the `taxi_trips` table
 
-- For convenience, add a `partition_date` column to represent which partition the record was inserted from. You’ll need to drop the existing `taxi_trips` because of the new `partition_date` column. In a Python REPL or scratch script, run the following:
+- For convenience, add a `partition_date` column to represent which partition the record was inserted from.
+
+  {% callout %}
+  You’ll need to drop the existing `taxi_trips` because of the new `partition_date` column. In a Python REPL or scratch script, run the following:
 
   ```yaml
   import duckdb
   conn = duckdb.connect(database="data/staging/data.duckdb")
   conn.execute("drop table trips;")
   ```
+  {% /callout %}
 
 - Because the `taxi_trips` table will exist after the first partition materializes, the SQL query will have to change
 
 - In this asset, you’ll need to do three actions:
   - Create the `taxi_trips` table if it doesn’t already exist
-  - Delete any old data from that `partition_date` to prevent duplicates when backfilling
+  - Delete any old data from `partition_date` to prevent duplicates when backfilling
   - Insert new records from the month’s parquet file
 
 ---
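
The three actions listed in the second patch (create the table if missing, delete the partition's old rows, insert the new rows) make the load safe to re-run during a backfill. Below is a minimal sketch of that pattern, not the lesson's solution. It assumes the standard-library `sqlite3` module as a stand-in for DuckDB so it runs without extra dependencies. The `load_partition` helper, the `trip_id` column, and the sample values are hypothetical. The table name `trips` follows the `drop table trips;` statement in the callout.

```python
import sqlite3


def load_partition(conn, partition_date, trip_ids):
    """Idempotently load one partition's rows (hypothetical helper)."""
    # 1. Create the table if it doesn't already exist.
    conn.execute(
        "create table if not exists trips (trip_id integer, partition_date text)"
    )
    # 2. Delete rows previously inserted for this partition, so a re-run
    #    (for example, a backfill) does not create duplicates.
    conn.execute("delete from trips where partition_date = ?", (partition_date,))
    # 3. Insert the new records, tagging each with its partition.
    conn.executemany(
        "insert into trips (trip_id, partition_date) values (?, ?)",
        [(trip_id, partition_date) for trip_id in trip_ids],
    )
    conn.commit()


# sqlite3 in-memory database stands in for data/staging/data.duckdb (assumption).
conn = sqlite3.connect(":memory:")
load_partition(conn, "2023-03-01", [1, 2, 3])
load_partition(conn, "2023-03-01", [1, 2, 3])  # simulate re-running the same partition
load_partition(conn, "2023-04-01", [4, 5])

march_rows = conn.execute(
    "select count(*) from trips where partition_date = ?", ("2023-03-01",)
).fetchone()[0]
total_rows = conn.execute("select count(*) from trips").fetchone()[0]
```

Re-running the March partition leaves its row count unchanged, which is the behavior the delete step is there to guarantee. In the actual asset the insert would read from the month's parquet file rather than from an in-memory list.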