-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pipes] edit to dagster pipes details/customization guide #17192
Conversation
Current dependencies on/for this PR: This comment was auto-generated by Graphite. |
|
||
1. The creation of communications channels between the orchestration and external process; | ||
2. The launching and terminating of the external process; | ||
3. The processing of all messages reported by the external process and the closing of communications channels. | ||
|
||
Pipes sessions are created in the orchestration process with the <PyObject object="open_pipes_session" /> context manager. This function is exported by `dagster` and should be called inside of an asset or op compute function - somewhere where an <PyObject object="OpExecutionContext" /> or <PyObject object="AssetExecutionContext" /> is available: | ||
There are separate APIs for interacting with a Pipes session in the orchestration and external processes. For the orchestration process, these APIs are defined in `dagster`. For the external process, these APIs are defined by a Pipes integration library that is loaded by user code in the external process. This library knows how to interpret the bootstrap payload and spin up a context loader and message writer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by this.
"For the external process, these APIs are defined by a Pipes integration library that is loaded by user code in the external process."
Do you mean just dagster-pipes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to emphasize that the external process library can be more than just dagster-pipes
-- dagster-pipes
explained in the next paragraph. Feel free to propose improvement here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a bit confusing, since we don't have any additional libraries right now.
Deploy preview for dagster-docs ready! Preview available at https://dagster-docs-mhhvp4f7j-elementl.vercel.app Direct link to changed pages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This guide is fantastic.
I think we can work on the graphic a bit, but we can do that separately
- **Opening routine**: Writes the context payload and spins up the message reader (which usually involves starting a thread to continually read messages). These steps may involve the creation of temporary resources, such as a temporary file (locally or on some remote system) for the context payload or a temporary directory to which messages will be written. | ||
- **Body**: User code should handle launching, polling, and termination of the external process here. While the external process is executing, any intermediate results that have been received can be reported to Dagster with `yield from pipes_session.get_results()`. | ||
- **Closing routine**: Ensures that all messages written by the external process have been read into the orchestration process and cleans up any resources used by the context injector and message reader. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is super clear 👍🏻
8ae659f
to
d81e68d
Compare
d81e68d
to
be74c81
Compare
Updates including: - Fix typos etc - Convert more object refs to formal `PyObject` links - Some light restructuring of the pipes lifecycle sections - Moving all code snippets into `docs_snippets` (and fixing import errors etc) - Added "Logs" to the terms and diagram - A note about customizing log handling Browsed.
Summary & Motivation
Updates including:
PyObject
linksdocs_snippets
(and fixing import errors etc)How I Tested These Changes
Browsed.