Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only have one kind of context for direct invocation #17554

Merged
merged 44 commits into from
Jan 29, 2024

Conversation

jamiedemaria
Copy link
Contributor

@jamiedemaria jamiedemaria commented Oct 31, 2023

Summary & Motivation

This PR merges UnboundOpExecutionContext and BoundOpExecutionContext into a single object, DirectOpExecutionContext.

There are three states the DirectOpExecutionContext can be in:

  1. pre-execution. The context is not tied to a particular asset or op to be executed. Therefore information like op_config or asset_key is not accessible. We call this state "unbound" or "not bound"
  2. during execution. The context is now tied to a particular asset or op that is being executed. The information above (like op_config) is now accessible. Additionally, the user may do things that mutate the context, like emit user events or add output metadata. We call this state "bound". This state of the context is the BoundOpExecutionContext in the old version of the code.
  3. post-execution. The context is no longer tied to a particular op/asset execution, however, information about emitted events or output metadata still needs to be available to the user so they can assert on the contents of these events. The context is also considered "unbound" in this state.

In order to make the distinction between these states more clear and easier to maintain for engineers, the attributes associated with each state are split into separate objects:
BoundProperties - maintains properties that are only available when the context is bound to an asset or op execution. When the context is bound to an invocation of a particular op, the BoundProperties object will be instantiated with the relevant properties for the context as bound to that op. At the end of execution, this object is deleted. We track the bound/unbound state of the context by the existence of the BoundProperties object.
DirectExecutionProperties - maintains attributes that can only be mutated while the context is bound, but can be read at any time (pre or post execution). If another execution begins using the same context, the DirectExecutionProperties object is replaced with a fresh instantiation.

Previous PR that does a similar thing but wasn't merged #15083

This PR accomplishes two main goals:

  • Makes is easier to add an AssetExecutionContext for direct invocation since the pattern for a direct invocation context is simplified.
  • Generally cleans up the direct invocation path by removing a confusing section of code. Sean put it well:
    • Motivation:
      Currently there are two classes UnboundOpExecutionContext and BoundOpExecutionContext. Both classes inherit from OpExecutionContext. UnboundOpExecutionContext is returned by build_op_context, i.e. it is used for direct op invocation. This is converted to a BoundOpExecutionContext in the call method of OpDefinition when an op is invoked.
      ~ The status quo is problematic for a few reasons:
      ~ The names (Un)BoundOpExecutionContext are confusing. It makes it sound like these classes are used for all op executions, when in fact they are never used as part of a run-- only for what is called "invocation".
      There is a large amount of code duplication between the two classes. This is hard to maintain and probably a source of bugs.

How I Tested These Changes

new unit tests

@jamiedemaria
Copy link
Contributor Author

jamiedemaria commented Oct 31, 2023

Current dependencies on/for this PR:

This stack of pull requests is managed by Graphite.

)


class UnboundOpExecutionContext(OpExecutionContext):
"""The ``context`` object available as the first argument to a solid's compute function when
class DirectInvocationOpExecutionContext(OpExecutionContext):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is fine, much clearer than previous name

Copy link
Collaborator

@smackesey smackesey Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to suggest RunlessOpExecutionContext:

  • Aligns with "runless" events
  • Communicates more about what makes this different from a regular OpExecutionContext-- contextual info about the run is missing
  • "Invocation" is not very self-explanatory and is overloaded-- it is used for instance when building a graph with the composition API inside @job (PendingNodeInvocation)
  • Shorter than DirectInvocationOpExecutionContext

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ I find this thinking compelling. Would also be ok with just DirectOpExecutionContext

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also relevant: #18265 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it should be since it is returned by public functions that build test contexts.

I think of it as an implementation detail "private" subclass of the real public parent class. I believe the typehints on those build test context methods are the public parent classes. It doesn't quite live up to the aspirations of being able to treat it like an OpExecutionContext and have everything work. I do like treating it as private and being able to change it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ ok makes sense

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schrockn do you like TestOpExecutionContext?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump for @schrockn to weigh in on naming

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligning on DirectOpExecutionContext. Reasoning:

  • Some users do full integration tests (like calling materialize) in these cases they won't be getting a "test" context, but they are still in a testing setup.
  • Direct side steps this issue because it's for the case when you are "directly" providing a context object to an op/asset
  • If we realize later the name needs to be different, the impact of changing the name later is low since it's a non-exported class

been validated.
"""

_op_def: OpDefinition
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what these were in the BoundOpExecutionContext for. I can add them to the new class though if they are needed!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For historical context here - op_def was added so that after you've actually invoked an op, you could access definition-level properties about that op on the context object itself. This is useful for the situation where you use op factories for example, and might still need to query definition-level information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought since these were on the BoundExecutionContext that they wouldn't be accessible after invocation since the bound context gets garbage collected

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right - i meant during invocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok - this should still work in the new setup

@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch from 05af4f3 to 1387482 Compare November 2, 2023 19:13
Copy link

github-actions bot commented Nov 2, 2023

Deploy preview for dagit-storybook ready!

✅ Preview
https://dagit-storybook-eesafdcx5-elementl.vercel.app
https://jamie-level-1-collapse-DI-contexts.components-storybook.dagster-docs.io

Built with commit 3096474.
This pull request is being automatically deployed with vercel-action

Copy link

github-actions bot commented Nov 2, 2023

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-inlpsqbme-elementl.vercel.app
https://jamie-level-1-collapse-DI-contexts.core-storybook.dagster-docs.io

Built with commit 3096474.
This pull request is being automatically deployed with vercel-action

Copy link

github-actions bot commented Nov 2, 2023

Deploy preview for dagster-docs ready!

Preview available at https://dagster-docs-px1j6c7ou-elementl.vercel.app
https://jamie-level-1-collapse-DI-contexts.dagster.dagster-docs.io

Direct link to changed pages:

@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch 2 times, most recently from 1116a08 to 8e5aa5a Compare November 13, 2023 18:30
@jamiedemaria jamiedemaria changed the title [wip] [prototype] Only have one kind of context for direct invocation [RFC] Only have one kind of context for direct invocation Nov 13, 2023
@jamiedemaria jamiedemaria marked this pull request as ready for review November 13, 2023 18:34
@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch 2 times, most recently from 3df3b58 to 4de29b1 Compare November 15, 2023 17:13
@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch 2 times, most recently from f8b02ae to 96e242d Compare November 16, 2023 21:09
Copy link

github-actions bot commented Nov 16, 2023

Deploy preview for dagster-university ready!

✅ Preview
https://dagster-university-gclhjjgau-elementl.vercel.app
https://jamie-level-1-collapse-DI-contexts.dagster-university.dagster-docs.io

Built with commit 3096474.
This pull request is being automatically deployed with vercel-action

@jamiedemaria jamiedemaria changed the base branch from master to jamie/update-build-asset-context-usages November 17, 2023 21:46
@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch from 96e242d to dee10a0 Compare November 17, 2023 21:46
@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch from dee10a0 to a31a3da Compare November 17, 2023 21:53
@jamiedemaria jamiedemaria force-pushed the jamie/level-1-collapse-DI-contexts branch from 3096474 to 92ea8d8 Compare January 29, 2024 15:38
@jamiedemaria jamiedemaria merged commit 07923e8 into master Jan 29, 2024
1 check passed
@jamiedemaria jamiedemaria deleted the jamie/level-1-collapse-DI-contexts branch January 29, 2024 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants