-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[docs][pipes] - Subprocess page #17096
[docs][pipes] - Subprocess page #17096
Conversation
Current dependencies on/for this PR:
This comment was auto-generated by Graphite. |
c8629f0
to
54f16f6
Compare
Deploy preview for dagster-docs ready! Preview available at https://dagster-docs-mdh8zw8nu-elementl.vercel.app Direct link to changed pages: |
54f16f6
to
21ccaeb
Compare
|
||
## Step 1: Create a Dagster asset that executes a subprocess | ||
|
||
In this step, you’ll create a Dagster asset that, when materialized, opens a Dagster pipes session and invokes subprocess that executes some external code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads awkwardly as "when materialized" could be interpreted as "when the external script has completed"
"asset that, in its execution function" could be good language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we refer to pipes process elsewhere, pipes session; what is the noun?
- Provided <PyObject object="AssetExecutionContext" /> as the `context` argument to the asset. This object provides system information such as resources, config, and logging. We’ll come back to this a bit later in this section. | ||
- Specified a resource for the asset to use, `PipesSubprocessClient`. We’ll also come back to this in a little bit. | ||
- Declared a command list `cmd` to run the external script. In the list: | ||
- First, found the path to the Python executable on the system using `shutil.which("python")`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would note that this is fairly contrived and that this becomes more compelling if the external script is either 1) invoked using a different python virtual env or 2) is a different programming language.
docs/content/guides/dagster-pipes/integrating-subprocess-with-dagster-pipes.mdx
Show resolved
Hide resolved
docs/content/guides/dagster-pipes/integrating-subprocess-with-dagster-pipes.mdx
Show resolved
Hide resolved
docs/content/guides/dagster-pipes/integrating-subprocess-with-dagster-pipes.mdx
Show resolved
Hide resolved
docs/content/guides/dagster-pipes/integrating-subprocess-with-dagster-pipes.mdx
Show resolved
Hide resolved
|
||
- The `PipesSubprocessClient` resource used by the asset contains a `run` method. | ||
- When the asset is executed, this method will synchronously execute the subprocess with in a pipes session, and it will return a `PipesClientCompletedInvocation` object. | ||
- This object contains a `get_results` method, which you can use to access the results reported by the subprocess. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is results
here?
I think it'd also be worth mentioning why PipesSubprocessClient
exists instead of just opening a subprocess manually.
docs/content/guides/dagster-pipes/integrating-subprocess-with-dagster-pipes.mdx
Show resolved
Hide resolved
close in favor of the alternative structure: #17108 |
## Summary & Motivation Instead of the previous PRs cross-referencing each other, we make them part 1 and part 2, with a note in part 1 to navigate the stakeholder personas (who have dagster assets set up by others) to jump directly to part 2. It also splits out a standalone "Reference" page to show dagster code and external code side-by-side for advanced pipes + subprocess usage. Concretely, this PR rearranges the content of #17092 and #17096 into: <img width="200" alt="Screenshot 2023-10-09 at 4 48 17 PM" src="https://github.com/dagster-io/dagster/assets/4531914/3371e6b1-7ddc-4e67-b98b-2ec599fadca3"> Preview Links: - [Dagster Pipes + subprocess](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess) * [Tutorial part 1: Define a Dagster asset](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess/create-subprocess-asset) * [Tutorial part 2: Modify external code](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess/modify-external-code) * [Reference](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess/reference) - this includes example code for both dagster side and external side listed side by side in tabs TODOs - [x] incorporate feedback from previous PRs - [x] update screenshots - all of them right now are for reviewing purpose only. will update once the overall flow is reviewed. - [ ] ~add tests for code snippets~ will add in a separate follow-up PR ## How I Tested These Changes
Instead of the previous PRs cross-referencing each other, we make them part 1 and part 2, with a note in part 1 to navigate the stakeholder personas (who have dagster assets set up by others) to jump directly to part 2. It also splits out a standalone "Reference" page to show dagster code and external code side-by-side for advanced pipes + subprocess usage. Concretely, this PR rearranges the content of #17092 and #17096 into: <img width="200" alt="Screenshot 2023-10-09 at 4 48 17 PM" src="https://github.com/dagster-io/dagster/assets/4531914/3371e6b1-7ddc-4e67-b98b-2ec599fadca3"> Preview Links: - [Dagster Pipes + subprocess](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess) * [Tutorial part 1: Define a Dagster asset](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess/create-subprocess-asset) * [Tutorial part 2: Modify external code](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess/modify-external-code) * [Reference](https://10-09--docs-pipes-rfc-subprocess-section.dagster.dagster-docs.io/guides/dagster-pipes/subprocess/reference) - this includes example code for both dagster side and external side listed side by side in tabs TODOs - [x] incorporate feedback from previous PRs - [x] update screenshots - all of them right now are for reviewing purpose only. will update once the overall flow is reviewed. - [ ] ~add tests for code snippets~ will add in a separate follow-up PR
Summary & Motivation
depends on #17092
This PR adds "Integrating Subprocess with Dagster Pipes" page which is tailored to the data platform eng persona. this page is also designed to pair with "Modifying existing code to work with Dagster Pipes" page (its parent pr in the stack).
build will be green once #17048 is landed.
TODOs
How I Tested These Changes
https://10-09--docs-pipes---subprocess-page.dagster.dagster-docs.io/guides/dagster-pipes/integrating-subprocess-with-dagster-pipes