Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: don't use default dbt selection with asset checks #18117

Conversation

johannkm
Copy link
Contributor

@johannkm johannkm commented Nov 17, 2023

Bug report: https://dagster.slack.com/archives/C04CW71AGBW/p1700237771181059?thread_ts=1697638899.590569&cid=C04CW71AGBW

Synopsis:

  • At definition time, we evaluate @dbt_assets using DBT_INDIRECT_SELECTION=eager [side note, I'm kind of surprised this isn't configurable].
  • At runtime, if you're using checks, we set DBT_INDIRECT_SELECTION=empty so that we have the option to exclude checks if they're not in the selection
  • If we are in a subset, this works because we explicitly pass the selected models and tests
  • However, if we determine we're not in a subset, we pass the default selection string, which may not select the tests now that we're not using eager.

Solutions:

  1. [this diff] If asset checks are enabled, always pass an explicit selection of models and tests
  2. The reason we try to use the default selection string when we can is to make things readable from the dbt side. We could try to preserve that by only setting DBT_INDIRECT_SELECTION=empty when we're doing a subset and passing the explicit selection, otherwise use eager and the default selection.

A concern for (2) is relationship tests. Take the case of two models

  • customers, one non null test
  • orders, one relationship test with customers

Dbt will resolve selection customers to the customers model and both the null test and the relationship test. If you're not using asset checks this is fine, we'll emit observations for both tests. If you are using checks, currently @dbt_assets(...select="customers") will only define the null test. This makes sense- we wouldn't even display the relationship test if we ingested it, because we didn't ingest the model its on.

With #1, we won't execute the relationship test (which matches the current behavior). With #2 we will whenever we switch over to eager, which is unexpected because we didn't model it.

Side note: we should document that when you enable checks, relationship tests on non selected models will no longer run

@johannkm
Copy link
Contributor Author

Current dependencies on/for this PR:

This stack of pull requests is managed by Graphite.

@johannkm johannkm force-pushed the johann/11-17-bugfix_don_t_use_default_dbt_selection_with_asset_checks branch 3 times, most recently from aa23fff to bcb5f98 Compare November 17, 2023 19:46
@johannkm johannkm requested review from rexledesma and sryza November 17, 2023 20:17
Copy link
Contributor

@rexledesma rexledesma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some wording suggestions on the logging statements

)
else:
logger.info(
"A dbt subsetted execution is not being performed. Expanding the default dbt selection"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Expanding" is ambiguous here. We need to explain that

  1. We are taking the default selection and just explicitly selecting the models and tests from it
  2. (1) is expected to be equivalent to the default selection, although it is more verbose

@johannkm johannkm force-pushed the johann/11-17-bugfix_don_t_use_default_dbt_selection_with_asset_checks branch from bcb5f98 to 465529f Compare November 27, 2023 17:58
@johannkm johannkm force-pushed the johann/11-17-bugfix_don_t_use_default_dbt_selection_with_asset_checks branch from 465529f to 315727f Compare November 27, 2023 17:59
@johannkm johannkm requested a review from rexledesma November 27, 2023 17:59
@johannkm johannkm merged commit 2e5d013 into master Nov 27, 2023
@johannkm johannkm deleted the johann/11-17-bugfix_don_t_use_default_dbt_selection_with_asset_checks branch November 27, 2023 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants