Separate pause and cancel (#14179)
billpalombi authored Jun 20, 2024
1 parent d680a31 commit b944f33
Showing 3 changed files with 93 additions and 89 deletions.
88 changes: 88 additions & 0 deletions docs/3.0rc/develop/control-workflows/cancel.mdx
@@ -0,0 +1,88 @@
---
title: Cancel flow runs
description: Learn the different ways to cancel a flow run.
---

## Cancel a flow run

You may cancel a scheduled or in-progress flow run from the CLI, UI, REST API, or Python client.

When cancellation is requested, the flow run moves to a "Cancelling" state.
If the deployment is work pool-based and has a worker, the worker monitors
the state of flow runs and detects the cancellation request.
The worker then sends a signal to the flow run infrastructure, requesting termination of the run.
If the run does not terminate within a grace period (30 seconds by default), the infrastructure is killed, ensuring the flow run exits.
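
The "signal first, kill after a grace period" pattern described above can be sketched with the Python standard library. This is an illustration only, not Prefect's worker code: the helper name `terminate_with_grace` and the sleeping child process that stands in for flow run infrastructure are assumptions made for the example.

```python
import subprocess
import sys


def terminate_with_grace(proc: subprocess.Popen, grace_seconds: float = 30.0) -> str:
    """Ask a process to stop, then kill it if it is still alive after the grace period."""
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # forceful stop (SIGKILL on POSIX)
        proc.wait()  # reap the process so it does not linger
        return "killed"


if __name__ == "__main__":
    # Hypothetical stand-in for flow run infrastructure: a child that sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(terminate_with_grace(child, grace_seconds=5.0))
```

A process that handles the termination signal and exits promptly is reported as `terminated`; one that ignores the signal is killed once the grace period elapses.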

<Warning>
**A deployment is required**
Flow run cancellation requires that the flow run is associated with a deployment.
A monitoring process must be running to enforce the cancellation.

Inline subflow runs (those created without `run_deployment`) cannot be cancelled without cancelling the parent flow run.
To cancel a subflow run independent of its parent flow run, we recommend deploying it separately
and starting it using the [run_deployment](/3.0rc/api-ref/prefect/deployments/deployments/#prefect.deployments.deployments.run_deployment)
function.
</Warning>

Cancellation is resilient to restarts of Prefect workers.
To enable this, we attach metadata about the created infrastructure to the flow run.
Internally, this is referred to as the `infrastructure_pid` or infrastructure identifier.
Generally, this is composed of two parts:

- Scope: identifying where the infrastructure is running.
- ID: a unique identifier for the infrastructure within the scope.

The scope ensures that Prefect does not kill the wrong infrastructure.
For example, workers running on multiple machines may have overlapping process IDs but should not have a matching scope.

The identifiers for infrastructure types are:

- Processes: The machine hostname and the PID.
- Docker Containers: The Docker API URL and container ID.
- Kubernetes Jobs: The Kubernetes cluster name and the job name.
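
As a rough illustration of the scope-plus-ID idea, the sketch below builds and parses an identifier for a local process. The `scope:id` string layout and the helper names are assumptions chosen for the example; the text above only states which two parts each infrastructure type uses, not how they are encoded.

```python
import os
import socket
from typing import NamedTuple


class InfrastructureId(NamedTuple):
    scope: str  # where the infrastructure runs, for example a hostname
    id: str     # unique within that scope, for example a PID


def format_identifier(scope: str, id_: str) -> str:
    # Hypothetical encoding: scope and ID joined by a colon.
    return f"{scope}:{id_}"


def parse_identifier(value: str) -> InfrastructureId:
    # Split on the last colon so scopes that contain colons (such as URLs) stay intact.
    scope, sep, id_ = value.rpartition(":")
    if not sep or not scope or not id_:
        raise ValueError(f"not a scope:id identifier: {value!r}")
    return InfrastructureId(scope, id_)


def local_process_identifier() -> str:
    # For a process, the scope is the machine hostname and the ID is the PID.
    return format_identifier(socket.gethostname(), str(os.getpid()))
```

Two machines can both have a process with PID 4242, but `host-a:4242` and `host-b:4242` remain distinct, which is what lets a worker avoid acting on infrastructure it does not own.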

While the cancellation process is robust, there are a few issues that can occur:

- If the infrastructure for the flow run does not support cancellation, cancellation will not work.
- If the identifier scope does not match when attempting to cancel a flow run, the worker cannot cancel the flow run.
Another worker may attempt cancellation.
- If the infrastructure associated with the run cannot be found or has already been killed, the worker marks the flow run as cancelled.
- If the `infrastructure_pid` is missing, the flow run is marked as cancelled but cancellation cannot be enforced.
- If the worker runs into an unexpected error during cancellation, the flow run may or may not be cancelled
depending on where the error occurred. The worker will try again to cancel the flow run. Another worker may attempt cancellation.
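
Several of the cases listed above can be read as a small decision table. The sketch below is a simplified model, not the worker's actual logic: the `Outcome` names, the `decide` function, the `scope:id` encoding, and the `running_ids` set are all hypothetical, and the unsupported-infrastructure and unexpected-error cases are left out.

```python
from enum import Enum
from typing import Optional, Set


class Outcome(Enum):
    KILLED = "infrastructure killed, run marked cancelled"
    MARKED_ONLY = "run marked cancelled, nothing to enforce"
    LEFT_FOR_OTHER_WORKER = "scope mismatch, another worker may attempt cancellation"


def decide(
    infrastructure_pid: Optional[str],
    worker_scope: str,
    running_ids: Set[str],
) -> Outcome:
    """Map one cancellation request to an outcome from this worker's point of view."""
    if not infrastructure_pid:
        # Missing identifier: the run is marked cancelled but cannot be enforced.
        return Outcome.MARKED_ONLY
    scope, _, infra_id = infrastructure_pid.rpartition(":")
    if scope != worker_scope:
        # This worker must not touch infrastructure outside its own scope.
        return Outcome.LEFT_FOR_OTHER_WORKER
    if infra_id not in running_ids:
        # Not found or already killed: nothing left to stop.
        return Outcome.MARKED_ONLY
    return Outcome.KILLED
```

For example, a worker on `host-a` that receives the identifier `host-b:4242` leaves the run for another worker, even if it has its own process with PID 4242.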

### Cancel through the CLI

From the command line in your execution environment, you can cancel a flow run by using the
`prefect flow-run cancel` CLI command, passing the ID of the flow run.

<div class="terminal">
```bash
prefect flow-run cancel 'a55a4804-9e3c-4042-8b59-b3b6b7618736'
```
</div>

### Cancel through the UI

Navigate to the flow run's detail page and click `Cancel` in the upper right corner.

![Prefect UI](/3.0rc/img/ui/flow-run-cancellation-ui.png)

## Timeouts

Flow timeouts prevent flows from unintentionally running for too long. When a flow's execution time
exceeds the duration specified in the timeout, a timeout exception is raised and the flow is marked as failed.
In the UI, the flow is visibly designated as `TimedOut`.

Timeout durations are specified using the `timeout_seconds` keyword argument.

```python hl_lines="4"
from prefect import flow
import time

@flow(timeout_seconds=1, log_prints=True)
def show_timeouts():
    print("I will execute")
    time.sleep(5)  # sleeps longer than the 1-second timeout
    print("I will not execute")
```
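
For readers who want to see the general timeout pattern without a Prefect installation, here is a minimal standard-library sketch. It is not how Prefect enforces `timeout_seconds`; the helper `run_with_timeout` and the returned labels are assumptions made for illustration.

```python
import asyncio


async def run_with_timeout(coro, timeout_seconds: float) -> str:
    """Await a coroutine and report whether it finished within the time limit."""
    try:
        await asyncio.wait_for(coro, timeout=timeout_seconds)
        return "Completed"
    except asyncio.TimeoutError:
        # wait_for cancels the coroutine before raising, so no work continues.
        return "TimedOut"


async def slow_work() -> None:
    await asyncio.sleep(5)  # stands in for a long-running flow body


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(slow_work(), timeout_seconds=0.1)))
```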
@@ -1,9 +1,8 @@
---
-title: Pause or cancel a flow run
-description: Learn the different ways to pause, suspend, and cancel a flow run.
+title: Pause and resume flow runs
+description: Learn the different ways to pause, suspend, and resume a flow run.
---


## Pause or suspend a flow run

Prefect allows you to halt a flow run with two functions that are similar but differ slightly.
@@ -212,88 +211,4 @@ After successful validation, the flow run resumes, and the return value of the `
is an instance of the `UserNameInput` model containing the provided data.

For more information on receiving input from users when pausing and suspending flow runs,
see [Create interactive workflows](/3.0rc/develop/control-workflows/).

3 changes: 2 additions & 1 deletion docs/mint.json
@@ -106,7 +106,8 @@
"pages": [
"3.0rc/develop/control-workflows/task-run-limits",
"3.0rc/develop/control-workflows/global-concurrency-limits",
-"3.0rc/develop/control-workflows/pause-cancel-flows",
+"3.0rc/develop/control-workflows/pause-resume",
+"3.0rc/develop/control-workflows/cancel",
"3.0rc/develop/control-workflows/inputs"
]
},