Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pipes] edit to dagster pipes details/customization guide #17192

Merged
merged 1 commit into from
Oct 13, 2023

Conversation

smackesey
Copy link
Collaborator

Summary & Motivation

Updates including:

  • Fix typos etc
  • Convert more object refs to formal PyObject links
  • Some light restructuring of the pipes lifecycle sections
  • Moving all code snippets into docs_snippets (and fixing import errors etc)
  • Added "Logs" to the terms and diagram
  • A note about customizing log handling

How I Tested These Changes

Browsed.

@smackesey
Copy link
Collaborator Author

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@smackesey smackesey requested a review from schrockn October 13, 2023 14:17

1. The creation of communications channels between the orchestration and external process;
2. The launching and terminating of the external process;
3. The processing of all messages reported by the external process and the closing of communications channels.

Pipes sessions are created in the orchestration process with the <PyObject object="open_pipes_session" /> context manager. This function is exported by `dagster` and should be called inside of an asset or op compute function - somewhere where an <PyObject object="OpExecutionContext" /> or <PyObject object="AssetExecutionContext" /> is available:
There are separate APIs for interacting with a Pipes session in the orchestration and external processes. For the orchestration process, these APIs are defined in `dagster`. For the external process, these APIs are defined by a Pipes integration library that is loaded by user code in the external process. This library knows how to interpret the bootstrap payload and spin up a context loader and message writer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this.

"For the external process, these APIs are defined by a Pipes integration library that is loaded by user code in the external process."

Do you mean just dagster-pipes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to emphasize that the external process library can be more than just dagster-pipes-- dagster-pipes explained in the next paragraph. Feel free to propose improvement here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a bit confusing, since we don't have any additional libraries right now.

Copy link
Member

@schrockn schrockn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guide is fantastic.

I think we can work on the graphic a bit, but we can do that separately

Comment on lines 243 to 248
- **Opening routine**: Writes the context payload and spins up the message reader (which usually involves starting a thread to continually read messages). These steps may involve the creation of temporary resources, such as a temporary file (locally or on some remote system) for the context payload or a temporary directory to which messages will be written.
- **Body**: User code should handle launching, polling, and termination of the external process here. While the external process is executing, any intermediate results that have been received can be reported to Dagster with `yield from pipes_session.get_results()`.
- **Closing routine**: Ensures that all messages written by the external process have been read into the orchestration process and cleans up any resources used by the context injector and message reader.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is super clear 👍🏻

@smackesey smackesey force-pushed the sean/pipes-guide-edits branch 2 times, most recently from 8ae659f to d81e68d Compare October 13, 2023 15:37
@smackesey smackesey force-pushed the sean/pipes-guide-edits branch from d81e68d to be74c81 Compare October 13, 2023 15:50
@smackesey smackesey merged commit a053f70 into master Oct 13, 2023
1 of 2 checks passed
@smackesey smackesey deleted the sean/pipes-guide-edits branch October 13, 2023 16:04
yuhan pushed a commit that referenced this pull request Oct 13, 2023
Updates including:

- Fix typos etc
- Convert more object refs to formal `PyObject` links
- Some light restructuring of the pipes lifecycle sections
- Moving all code snippets into `docs_snippets` (and fixing import
errors etc)
- Added "Logs" to the terms and diagram
- A note about customizing log handling

Browsed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants