Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[external-assets] Change build_assets_job interface to use executable_assets and loadable_assets #19765

Merged
merged 2 commits into from
Feb 21, 2024

Conversation

smackesey
Copy link
Collaborator

@smackesey smackesey commented Feb 13, 2024

Summary & Motivation

build_assets_job (an internal API) is the codepath through which all asset job construction flows. Currently, it accepts two lists of assets:

  • assets: Sequence[AssetsDefinition]
  • source_assets: Sequence[Union[AssetsDefinition, SourceAsset]]

This is quite confusing. In particular, source_assets is poorly named. Not only does it contain AssetsDefinition (which gets internally converted to SourceAsset inside the function) but it also contains observable source assets that actually have nodes that will be executed.

This PR changes build_assets_job to accept:

  • executable_assets: Sequence[Union[AssetsDefinition, SourceAsset]]
  • loadable_assets: Sequence[Union[AssetsDefinition, SourceAsset]]

This makes the purpose of these two assets arrays more clear. It can be considered an incremental step in the source asset -> external asset refactor.

The PR leaves the internal logic of build_assets_job unchanged (though it will be changed upstack), passing separate collections of AssetsDefinition and SourceAsset further into the system per the status quo.

How I Tested These Changes

Existing test suite.

A few GQL snapshots are updated due to minor changes in the asset base jobs in the GQL repo. Specifically, the unexecutable external asset is now no longer included in any of the base jobs where previously it was included in all of them (as all unpartitioned assets defs were). This is not a function of change to the build_assets_job logic but rather to the base job construction routine (which is one of the two build_assets_job callsites).

@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from 19f636e to d38dd8d Compare February 13, 2024 15:41
@smackesey smackesey force-pushed the sean/make-asset-execution-type-property branch from 046bc7f to 46799c0 Compare February 13, 2024 15:45
@smackesey smackesey changed the title [external-assets] Change build_assets_job interface to use assets_to_execute [external-assets] Change build_assets_job interface to use executable_assets and loadable_assets Feb 13, 2024
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from d38dd8d to b9050c4 Compare February 13, 2024 15:58
@smackesey smackesey force-pushed the sean/make-asset-execution-type-property branch from 46799c0 to 72f2a10 Compare February 13, 2024 16:03
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from b9050c4 to ba4542e Compare February 13, 2024 16:06
@smackesey smackesey force-pushed the sean/make-asset-execution-type-property branch from 72f2a10 to 461f3a7 Compare February 13, 2024 16:22
@smackesey smackesey changed the base branch from sean/make-asset-execution-type-property to sean/external-auto-observe-interval-minutes February 13, 2024 16:43
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from ba4542e to 048a38d Compare February 13, 2024 16:43
@smackesey smackesey force-pushed the sean/external-auto-observe-interval-minutes branch from f1f5c9e to 3f2aba0 Compare February 13, 2024 17:09
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from 048a38d to a2f2c3d Compare February 13, 2024 17:09
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from a2f2c3d to ce80133 Compare February 13, 2024 17:18
@smackesey smackesey force-pushed the sean/external-auto-observe-interval-minutes branch from 3f2aba0 to b6156fe Compare February 13, 2024 17:41
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from ce80133 to b83dc77 Compare February 13, 2024 18:13
@smackesey smackesey force-pushed the sean/external-auto-observe-interval-minutes branch 3 times, most recently from 4e87476 to 15a91a8 Compare February 14, 2024 16:06
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from b83dc77 to 0f4d3e1 Compare February 14, 2024 16:15
@smackesey smackesey force-pushed the sean/external-auto-observe-interval-minutes branch 2 times, most recently from d73afae to 354ac39 Compare February 15, 2024 10:04
Base automatically changed from sean/external-auto-observe-interval-minutes to master February 15, 2024 10:35
@smackesey smackesey force-pushed the sean/external-assets-observe branch from 13c86b0 to 9052bc8 Compare February 20, 2024 12:29
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from 1ae5a2c to 03b5a31 Compare February 20, 2024 12:29
@smackesey smackesey force-pushed the sean/external-assets-observe branch from 9052bc8 to fd7e3f9 Compare February 20, 2024 14:34
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from 03b5a31 to c58fbba Compare February 20, 2024 14:34
@smackesey smackesey force-pushed the sean/external-assets-observe branch from fd7e3f9 to 40a9855 Compare February 20, 2024 15:40
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from c58fbba to d3673a7 Compare February 20, 2024 15:40
@smackesey smackesey force-pushed the sean/external-assets-observe branch 2 times, most recently from 214c718 to 9d4a240 Compare February 20, 2024 20:46
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from d3673a7 to 96bc75b Compare February 20, 2024 20:46
@smackesey smackesey force-pushed the sean/external-assets-observe branch from 9d4a240 to bc6baf2 Compare February 20, 2024 21:08
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from 96bc75b to 30b2395 Compare February 20, 2024 21:08
@smackesey smackesey force-pushed the sean/external-assets-observe branch from bc6baf2 to d32e251 Compare February 20, 2024 21:10
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch 2 times, most recently from 70c1932 to 1d3da61 Compare February 20, 2024 21:19
@smackesey smackesey marked this pull request as ready for review February 20, 2024 21:20
@smackesey smackesey requested a review from schrockn February 20, 2024 21:20
Copy link
Member

@schrockn schrockn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. Please heed final comment on invariants.

assets_by_partitions_def[observable.partitions_def] = []
if len(assets_by_partitions_def.keys()) == 0 or assets_by_partitions_def.keys() == {None}:
for asset in executable_assets:
executable_assets_by_partitions_def[asset.partitions_def].append(asset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very odd that PartitionsDefinition objects are hashable but this predates this PR

Comment on lines +79 to +81
# sort to ensure some stability in the ordering
all_partitions_defs = sorted(
[p for p in executable_assets_by_partitions_def.keys() if p], key=repr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and indeed on the next line something odd as a result of it being hashable

Comment on lines +128 to +129
executable_assets: Sequence[Union[AssetsDefinition, SourceAsset]],
loadable_assets: Optional[Sequence[Union[SourceAsset, AssetsDefinition]]] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much much better than older names

check.iterable_param(assets, "assets", of_type=(AssetsDefinition, SourceAsset))
source_assets = check.opt_sequence_param(
source_assets, "source_assets", of_type=(SourceAsset, AssetsDefinition)
check.iterable_param(executable_assets, "assets", of_type=(AssetsDefinition, SourceAsset))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also iterate through this and insure that all the source assets are observable as an invariant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, can actually just check is_executable so that it flags unexecutable externals as well.

Base automatically changed from sean/external-assets-observe to master February 20, 2024 21:53
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch 3 times, most recently from a3e7f79 to 72747d0 Compare February 21, 2024 11:52
@smackesey smackesey force-pushed the sean/external-assets-assets-to-execute branch from 72747d0 to ef13c4f Compare February 21, 2024 12:07
@smackesey smackesey merged commit caa44a2 into master Feb 21, 2024
1 check passed
@smackesey smackesey deleted the sean/external-assets-assets-to-execute branch February 21, 2024 12:35
PedramNavid pushed a commit that referenced this pull request Mar 28, 2024
…e_assets` and `loadable_assets` (#19765)

## Summary & Motivation

`build_assets_job` (an internal API) is the codepath through which all
asset job construction flows. Currently, it accepts two lists of assets:

- `assets: Sequence[AssetsDefinition]`
- `source_assets: Sequence[Union[AssetsDefinition, SourceAsset]]`

This is quite confusing. In particular, `source_assets` is poorly named.
Not only does it contain `AssetsDefinition` (which gets internally
converted to `SourceAsset` inside the function) but it also contains
observable source assets that actually have nodes that will be executed.

This PR changes `build_assets_job` to accept:

- `executable_assets: Sequence[Union[AssetsDefinition, SourceAsset]]`
- `loadable_assets: Sequence[Union[AssetsDefinition, SourceAsset]]`

This makes the purpose of these two assets arrays more clear. It can be
considered an incremental step in the source asset -> external asset
refactor.

The PR leaves the internal logic of `build_assets_job` unchanged (though
it will be changed upstack), passing separate collections of
`AssetsDefinition` and `SourceAsset` further into the system per the
status quo.

## How I Tested These Changes

Existing test suite.

A few GQL snapshots are updated due to minor changes in the asset base
jobs in the GQL repo. Specifically, the unexecutable external asset is
now no longer included in any of the base jobs where previously it was
included in all of them (as all unpartitioned assets defs were). This is
not a function of change to the `build_assets_job` logic but rather to
the base job construction routine (which is one of the two
`build_assets_job` callsites).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants