Merge branch 'master' into erin/doc-162-subset-asset-checks
erinkcochran87 committed Apr 25, 2024
2 parents 689438d + ed9c92a commit ba812de
Showing 590 changed files with 16,899 additions and 11,617 deletions.
@@ -381,6 +381,8 @@ def _unsupported_dagster_python_versions(tox_factor: Optional[str]) -> List[Avai
"cli_tests",
"core_tests_pydantic1",
"core_tests_pydantic2",
"model_tests_pydantic1",
"model_tests_pydantic2",
"storage_tests_sqlalchemy_1_3",
"storage_tests_sqlalchemy_1_4",
"daemon_sensor_tests",
3 changes: 2 additions & 1 deletion .buildkite/dagster-buildkite/dagster_buildkite/steps/tox.py
@@ -51,7 +51,8 @@ def build_tox_step(
commands = [
*(extra_commands_pre or []),
f"cd {root_dir}",
"pip install -U uv",
# 0.1.33 is installing old versions of libraries. Remove this pin once this is fixed.
'pip install --force-reinstall "uv==0.1.32"',
f"echo -e {shlex.quote(buildkite_section_header)}",
tox_command,
*(extra_commands_post or []),
1 change: 0 additions & 1 deletion .github/CODEOWNERS

This file was deleted.

5 changes: 4 additions & 1 deletion .github/workflows/build-dagster-university.yml
@@ -19,8 +19,11 @@ jobs:
# Deploy to Vercel Previews on pull request
- name: Get branch preview subdomain
if: github.event_name == 'pull_request'
env:
HEAD_REF: ${{ github.head_ref }}
REF_NAME: ${{ github.ref_name }}
run: |
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${{ github.head_ref || github.ref_name }}" | sed 's/[^a-zA-Z0-9-]/-/g' | sed 's/^-*//' | sed 's/-*$//')
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${HEAD_REF:-$REF_NAME}" | sed -e 's/[^a-zA-Z0-9-]/-/g; s/^-*//; s/-*$//')
echo "$BRANCH_PREVIEW_SUBDOMAIN"
echo "BRANCH_PREVIEW_SUBDOMAIN=$BRANCH_PREVIEW_SUBDOMAIN" >> "${GITHUB_ENV}"
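The reworked `run` step reads the ref from environment variables instead of interpolating `${{ ... }}` into the script, then normalizes it with one `sed` call. The same normalization can be sketched in Python as a rough illustration; the function name `branch_preview_subdomain` is hypothetical and does not appear in the repository.

```python
import re


def branch_preview_subdomain(head_ref: str, ref_name: str) -> str:
    """Hypothetical helper mirroring the workflow's shell pipeline."""
    # ${HEAD_REF:-$REF_NAME}: fall back to REF_NAME when HEAD_REF is unset or empty.
    branch = head_ref or ref_name
    # s/[^a-zA-Z0-9-]/-/g: replace every character outside [a-zA-Z0-9-] with a hyphen.
    cleaned = re.sub(r"[^a-zA-Z0-9-]", "-", branch)
    # s/^-*//; s/-*$//: drop leading and trailing hyphens.
    return cleaned.strip("-")
```

For a pull request, `head_ref` carries the source branch name; for a push it is empty, so the function falls back to `ref_name`, which matches the shell's `:-` default expansion.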
5 changes: 4 additions & 1 deletion .github/workflows/build-docs.yml
@@ -20,11 +20,14 @@ jobs:
steps:
# Deploy to Vercel Previews on pull request, push to master branch, or push to release-* branch
- name: Get branch preview subdomain
env:
HEAD_REF: ${{ github.head_ref }}
REF_NAME: ${{ github.ref_name }}
if: |
github.event_name == 'pull_request' ||
(github.event_name == 'push' && (github.ref == 'refs/heads/master' || startsWith(github.ref, 'refs/heads/release-')))
run: |
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${{ github.head_ref || github.ref_name }}" | sed 's/[^a-zA-Z0-9-]/-/g' | sed 's/^-*//' | sed 's/-*$//')
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${HEAD_REF:-$REF_NAME}" | sed -e 's/[^a-zA-Z0-9-]/-/g; s/^-*//; s/-*$//')
echo "$BRANCH_PREVIEW_SUBDOMAIN"
echo "BRANCH_PREVIEW_SUBDOMAIN=$BRANCH_PREVIEW_SUBDOMAIN" >> "${GITHUB_ENV}"
5 changes: 4 additions & 1 deletion .github/workflows/build-storybook-core.yml
@@ -14,8 +14,11 @@ jobs:
steps:
- name: Get branch preview subdomain
if: github.event_name == 'pull_request'
env:
HEAD_REF: ${{ github.head_ref }}
REF_NAME: ${{ github.ref_name }}
run: |
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${{ github.head_ref || github.ref_name }}" | sed 's/[^a-zA-Z0-9-]/-/g' | sed 's/^-*//' | sed 's/-*$//')
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${HEAD_REF:-$REF_NAME}" | sed -e 's/[^a-zA-Z0-9-]/-/g; s/^-*//; s/-*$//')
echo "$BRANCH_PREVIEW_SUBDOMAIN"
echo "BRANCH_PREVIEW_SUBDOMAIN=$BRANCH_PREVIEW_SUBDOMAIN" >> "${GITHUB_ENV}"
- uses: actions/checkout@v4
5 changes: 4 additions & 1 deletion .github/workflows/build-storybook-ui.yml
@@ -14,8 +14,11 @@ jobs:
steps:
- name: Get branch preview subdomain
if: github.event_name == 'pull_request'
env:
HEAD_REF: ${{ github.head_ref }}
REF_NAME: ${{ github.ref_name }}
run: |
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${{ github.head_ref || github.ref_name }}" | sed 's/[^a-zA-Z0-9-]/-/g' | sed 's/^-*//' | sed 's/-*$//')
BRANCH_PREVIEW_SUBDOMAIN=$(echo "${HEAD_REF:-$REF_NAME}" | sed -e 's/[^a-zA-Z0-9-]/-/g; s/^-*//; s/-*$//')
echo "$BRANCH_PREVIEW_SUBDOMAIN"
echo "BRANCH_PREVIEW_SUBDOMAIN=$BRANCH_PREVIEW_SUBDOMAIN" >> "${GITHUB_ENV}"
- uses: actions/checkout@v4
80 changes: 77 additions & 3 deletions CHANGES.md
@@ -1,5 +1,79 @@
# Changelog

# 1.7.2 (core) / 0.23.2 (libraries)

### New

- Performance improvements when loading large asset graphs in the Dagster UI.
- `@asset_check` functions can now be invoked directly for unit testing.
- `dagster-embedded-elt` dlt resource `DagsterDltResource` can now be used from `@op` definitions in addition to assets.
- `UPathIOManager.load_partitions` has been added to help `UPathIOManager` subclasses handle serialization formats that support partitioning. Thanks `@danielgafni`!
- [dagster-polars] Partitioning columns can now use data types other than string. `PolarsDeltaIOManager` also now supports `MultiPartitionsDefinition` with `DeltaLake` native partitioning. To enable this feature, specify the metadata value `"partition_by": {"dim_1": "col_1", "dim_2": "col_2"}`. Thanks `@danielgafni`!

### Bugfixes

- [dagster-airbyte] Auto materialization policies passed to `load_assets_from_airbyte_instance` and `load_assets_from_airbyte_project` will now be properly propagated to the created assets.
- Fixed an issue where deleting a run that was intended to materialize a partitioned asset would sometimes leave the status of that asset as “Materializing” in the Dagster UI.
- Fixed an issue with `build_time_partition_freshness_checks` where it would incorrectly intuit that an asset was not fresh in certain cases.
- [dagster-k8s] Fixed an error on transient ‘none’ responses for pod waiting reasons. Thanks [@piotrmarczydlo](https://github.com/piotrmarczydlo)!
- [dagster-dbt] Failing to build column schema metadata will now result in a warning rather than an error.
- Fixed an issue where incorrect asset keys would cause a backfill to fail loudly.
- Fixed an issue where syncing unmaterialized assets could include source assets.

### Breaking Changes

- [dagster-polars] `PolarsDeltaIOManager` no longer supports loading natively partitioned DeltaLake tables as dictionaries. They should be loaded as a single `pl.DataFrame`/`pl.LazyFrame` instead.

### Documentation

- Renamed `Dagster Cloud` to `Dagster+` throughout the docs.
- Added a page about [Change Tracking](https://docs.dagster.io/dagster-plus/managing-deployments/branch-deployments/change-tracking) in Dagster+ branch deployments.
- Added a section about [user-defined metrics](https://docs.dagster.io/concepts/metadata-tags/asset-metadata#asset-owners) to the Dagster+ Insights docs.
- Added a section about [Asset owners](https://docs.dagster.io/concepts/metadata-tags/asset-metadata#asset-owners) to the asset metadata docs.

### Dagster Cloud

- Branch deployments now have Change Tracking. Assets in each branch deployment will be compared to the main deployment. New assets and changes to code version, dependencies, partitions definitions, tags, and metadata will be marked in the UI of the branch deployment.
- PagerDuty alerting is now supported with Pro plans. See the [documentation](https://docs.dagster.io/dagster-cloud/managing-deployments/alerts/pagerduty) for more info.
- Asset metadata is now included in the insights metrics for jobs materializing those assets.
- Per-run Insights are now available on individual assets.
- Previously, the `before_storage_id` / `after_storage_id` values in the `AssetRecordsFilter` class were ignored. This has been fixed.
- Updated the output of `dagster-cloud deployment alert-policies list` to match the format of `sync`.
- Fixed an issue where Dagster Cloud agents with many code locations would sometimes leave code servers running after the agent shut down.

# 1.7.1 (core) / 0.23.1 (libraries)

### New

- [dagster-dbt][experimental] A new cli command `dagster-dbt project prepare-for-deployment` has been added in conjunction with `DbtProject` for managing the behavior of rebuilding the manifest during development and preparing a pre-built one for production.

### Bugfixes

- Fixed an issue with duplicate asset check keys when loading checks from a package.
- A bug with the new `build_last_update_freshness_checks` and `build_time_partition_freshness_checks` has been fixed where multi_asset checks passed in would not be executable.
- [dagster-dbt] Fixed some issues with building column lineage for incremental models, models with implicit column aliases, and models with columns that have multiple dependencies on the same upstream column.

### Breaking Changes

- [dagster-dbt] The experimental `DbtArtifacts` class has been replaced by `DbtProject`.

### Documentation

- Added a dedicated concept page for all things [metadata and tags](https://docs.dagster.io/concepts/metadata-tags)
- Moved asset metadata content to a dedicated concept page: [Asset metadata](https://docs.dagster.io/concepts/metadata-tags/asset-metadata)
- Added section headings to the [Software-defined Assets API reference](https://docs.dagster.io/_apidocs/assets), which groups APIs by asset type or use
- Added a guide about [user settings in the Dagster UI](https://docs.dagster.io/concepts/webserver/ui-user-settings)
- Added `AssetObservation` to the Software-defined Assets API reference
- Renamed Dagster Cloud GitHub workflow files to the new, consolidated `dagster-cloud-deploy.yml`
- Miscellaneous formatting and copy updates
- [community-contribution] [dagster-embedded-elt] Fixed `get_asset_key` API documentation (thanks @aksestok!)
- [community-contribution] Updated Python version in contributing documentation (thanks @piotrmarczydlo!)
- [community-contribution] Typo fix in README (thanks @MiConnell!)

### Dagster Cloud

- Fixed a bug where an incorrect value was being emitted for BigQuery bytes billed in Insights.

# 1.7.0 (core) / 0.23.0 (libraries)

## Major Changes since 1.6.0 (core) / 0.22.0 (libraries)
@@ -15,7 +89,7 @@
- In the asset graph UI, the “Upstream data”, “Code version changed”, and “Upstream code version” statuses have been collapsed into a single “Unsynced” status. Clicking on “Unsynced” displays more detailed information.
- I/O managers are now optional. This enhances flexibility for scenarios where they are not necessary. For guidance, see [When to use I/O managers](https://docs.dagster.io/concepts/io-management/io-managers#when-to-use-io-managers).
- Assets with `None` or `MaterializeResult` return type annotations won't use I/O managers; dependencies for these assets can be set using the `deps` parameter in the `@asset` decorator.
- [dagster-dbt] Dagster’s dbt integration can now be configured to automatically collect [metadata about column schema and column lineage](https://www.notion.so/https-docs-dagster-io-integrations-snowflake-3ec05f9d3677469b8c0ceec1d6cf6a45?pvs=21).
- [dagster-dbt] Dagster’s dbt integration can now be configured to automatically collect [metadata about column schema and column lineage](https://docs.dagster.io/integrations/dbt/reference#emit-column-level-metadata-as-materialization-metadata-).
- [dagster-dbt] dbt tests are now pulled in as Dagster asset checks by default.
- [dagster-dbt] dbt resource tags are now automatically pulled in as Dagster asset tags.
- [dagster-snowflake] [dagster-gcp] The dagster-snowflake and dagster-gcp packages now both expose a `fetch_last_updated_timestamps` API, which makes it straightforward to collect data freshness information in source asset observation functions.
@@ -3587,14 +3661,14 @@ nux:
### New

- Tags can now be provided to an asset reconciliation sensor and will be applied to all RunRequests returned by the sensor.
- If you don’t explicitly specify a DagsterType on a graph input, but all the inner inputs that the graph input maps to have the same DagsterType, the graph input’s DagsterType will be set to the the DagsterType of the inner inputs.
- If you don’t explicitly specify a DagsterType on a graph input, but all the inner inputs that the graph input maps to have the same DagsterType, the graph input’s DagsterType will be set to the DagsterType of the inner inputs.
- [dagster-airbyte] `load_assets_from_airbyte_project` now caches the project data generated at repo load time so it does not have to be regenerated in subprocesses.
- [dagster-airbyte] Output table schema metadata is now generated at asset definition time when using `load_assets_from_airbyte_instance` or `load_assets_from_airbyte_project`.
- [dagit] The run timeline now groups all jobs by repository. You can collapse or expand each repository in this view by clicking the repository name. This state will be preserved locally. You can also hold `Shift` while clicking the repository name, and all repository groups will be collapsed or expanded accordingly.
- [dagit] In the launchpad view, a “Remove all” button is now available once you have accrued three or more tabs for that job, to make it easier to clear stale configuration tabs from view.
- [dagit] When scrolling through the asset catalog, the toolbar is now sticky. This makes it simpler to select multiple assets and materialize them without requiring you to scroll back to the top of the page.
- [dagit] A “Materialize” option has been added to the action menu on individual rows in the asset catalog view.
- [dagster-aws] The `EcsRunLauncher` now allows you to pass in a dictionary in the `task_definition` config field that specifies configuration for the task definition of the launched run, including role ARNs and a list of sidecar containers to include. Previously, the task definition could only be configured by passing in a task definition ARN or by basing the the task definition off of the task definition of the ECS task launching the run. See the [docs](https://docs.dagster.io/_apidocs/libraries/dagster-aws#dagster_aws.ecs.EcsRunLauncher) for the full set of available config.
- [dagster-aws] The `EcsRunLauncher` now allows you to pass in a dictionary in the `task_definition` config field that specifies configuration for the task definition of the launched run, including role ARNs and a list of sidecar containers to include. Previously, the task definition could only be configured by passing in a task definition ARN or by basing the task definition off of the task definition of the ECS task launching the run. See the [docs](https://docs.dagster.io/_apidocs/libraries/dagster-aws#dagster_aws.ecs.EcsRunLauncher) for the full set of available config.

### Bugfixes

28 changes: 15 additions & 13 deletions docs/content/_apidocs.mdx
@@ -352,8 +352,8 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-celery-docker">Celery+Docker</a> (
<code>dagster-celery-docker</code>)
<a href="/_apidocs/libraries/dagster-celery-docker">Celery & Docker</a>{" "}
(<code>dagster-celery-docker</code>)
</td>
<td>
Provides an executor that lets Celery workers execute in Docker
@@ -362,8 +362,8 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-celery-k8s">Celery+Kubernetes</a> (
<code>dagster-celery-k8s</code>)
<a href="/_apidocs/libraries/dagster-celery-k8s">Celery & Kubernetes</a>{" "}
(<code>dagster-celery-k8s</code>)
</td>
<td>
{" "}
@@ -439,21 +439,23 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-duckdb-pandas">DuckDB+Pandas</a> (
<code>dagster-duckdb-pandas</code>)
<a href="/_apidocs/libraries/dagster-duckdb-pandas">DuckDB & Pandas</a>{" "}
(<code>dagster-duckdb-pandas</code>)
</td>
<td>Provides support for storing Pandas DataFrames in DuckDB.</td>
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-duckdb-polars">DuckDB+Polars</a> (
<code>dagster-duckdb-polars</code>)
<a href="/_apidocs/libraries/dagster-duckdb-polars">DuckDB & Polars</a>{" "}
(<code>dagster-duckdb-polars</code>)
</td>
<td>Provides support for storing Polars DataFrames in DuckDB.</td>
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-duckdb-pyspark">DuckDB+PySpark</a>{" "}
<a href="/_apidocs/libraries/dagster-duckdb-pyspark">
DuckDB & PySpark
</a>{" "}
(<code>dagster-duckdb-pyspark</code>)
</td>
<td>Provides support for storing PySpark DataFrames in DuckDB.</td>
@@ -491,7 +493,7 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-gcp-pandas">GCP+Pandas</a> (
<a href="/_apidocs/libraries/dagster-gcp-pandas">GCP & Pandas</a> (
<code>dagster-gcp-pandas</code>)
</td>
<td>
@@ -501,7 +503,7 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
</tr>
<tr>
<td>
<a href="/_apidocs/libraries/dagster-gcp-pyspark">GCP+PySpark</a> (
<a href="/_apidocs/libraries/dagster-gcp-pyspark">GCP & PySpark</a> (
<code>dagster-gcp-pyspark</code>)
</td>
<td>
@@ -665,7 +667,7 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
<tr>
<td>
<a href="/_apidocs/libraries/dagster-snowflake-pandas">
Snowflake+Pandas
Snowflake & Pandas
</a>{" "}
(<code>dagster-snowflake-pandas</code>)
</td>
@@ -674,7 +676,7 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi
<tr>
<td>
<a href="/_apidocs/libraries/dagster-snowflake-pyspark">
Snowflake+PySpark
Snowflake & PySpark
</a>{" "}
(<code>dagster-snowflake-pyspark</code>)
</td>