
Control deployment concurrency limits with an orchestration policy #15475

Closed
abrookins opened this issue Sep 24, 2024 · 0 comments · Fixed by #15504
Labels
enhancement An improvement of an existing feature

Comments

abrookins commented Sep 24, 2024

Describe the current behavior

In #15085 and #15022, we made workers aware of deployment concurrency limits. This placed the logic for releasing a concurrency slot closest to our knowledge of when and why a flow run succeeded or failed, so slots could be released reliably.

However, the tradeoff is that users must upgrade their workers to the latest version of Prefect before they can use deployment concurrency limits. We want to support users who do not or cannot upgrade their workers.

Describe the proposed behavior

The best way to enable older clients to use deployment concurrency limits is to move concurrency handling into new SecureFlowRunConcurrencySlots and ReleaseFlowRunConcurrencySlots rules in the core orchestration policy for flows. If we handle deployment concurrency as an orchestration rule, a worker's attempt to propose a state like Pending or Failed would acquire (or release) concurrency slots server-side.
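To make the shape of the two proposed rules concrete, here is a minimal in-memory sketch of the slot bookkeeping they would perform. This is illustrative only: the class and method names below are hypothetical, not Prefect's actual orchestration-rule API, and the real rules would hook into state-transition machinery rather than be called directly.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory model of the two proposed rules. In Prefect itself,
# this bookkeeping would live in orchestration rules evaluated on state
# proposals, not in a standalone class like this.
@dataclass
class DeploymentConcurrency:
    limit: int
    active: set = field(default_factory=set)

    def secure_slot(self, flow_run_id: str) -> bool:
        """Sketch of SecureFlowRunConcurrencySlots: evaluated when a worker
        proposes Pending. Returns False when no slot is available, which the
        server would translate into a rejection and a reschedule."""
        if flow_run_id in self.active:
            return True  # idempotent: this run already holds a slot
        if len(self.active) >= self.limit:
            return False  # limit reached; reject the Pending proposal
        self.active.add(flow_run_id)
        return True

    def release_slot(self, flow_run_id: str) -> None:
        """Sketch of ReleaseFlowRunConcurrencySlots: evaluated on terminal
        states (Completed, Failed, Crashed)."""
        self.active.discard(flow_run_id)
```

Because the check runs on the server when a state is proposed, a worker on any client version gets the limit enforced without knowing the rule exists.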

To propose a new state for a flow run, a worker already has the same knowledge it would need to acquire or release a concurrency slot. Here are some cases that will work the same with an orchestration policy in place of worker logic for deployment concurrency limits:

  1. A worker needs to acquire a concurrency slot before spinning up expensive infrastructure for a flow run. This will happen transparently to the worker when it proposes Pending and receives a rejection because a slot isn't available. The orchestration policy will reschedule the run to try again.
  2. A flow.serve() worker runs a flow with a deployment concurrency limit. No special handling is needed here because, just like other workers, when the worker proposes a Pending state, orchestration logic will acquire a slot.
  3. A worker knows that the infrastructure running a flow crashed. It can propose a Crashed state for the flow, which will transparently release a concurrency slot.
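The three cases above can be simulated end-to-end with a toy orchestrator. Everything here (the `Server` class, the state enum, the reschedule-on-rejection behavior) is an assumption made for illustration; it is not Prefect's server implementation.

```python
from enum import Enum

class State(Enum):
    SCHEDULED = "Scheduled"
    PENDING = "Pending"
    CRASHED = "Crashed"

class Server:
    """Toy orchestrator: slot bookkeeping happens on state proposals,
    so workers never manage slots themselves."""
    def __init__(self, limit: int):
        self.limit = limit
        self.slots: set[str] = set()

    def propose(self, run_id: str, state: State) -> State:
        if state is State.PENDING:
            # Cases 1 and 2: acquire a slot before infrastructure is created.
            if run_id not in self.slots and len(self.slots) >= self.limit:
                return State.SCHEDULED  # rejected; rescheduled to retry later
            self.slots.add(run_id)
        elif state is State.CRASHED:
            # Case 3: a crash report transparently releases the slot.
            self.slots.discard(run_id)
        return state

server = Server(limit=1)
server.propose("run-1", State.PENDING)    # slot acquired; run proceeds
server.propose("run-2", State.PENDING)    # limit hit; run is rescheduled
server.propose("run-1", State.CRASHED)    # crash releases run-1's slot
server.propose("run-2", State.PENDING)    # slot free; run-2 proceeds
```

Note that the worker's code path is identical in every case: it proposes a state and reacts to the orchestrator's answer, which is why older clients need no changes.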

Note: Prefect 2's task run concurrency hooked into the Running state. However, in this case, we want to prevent setting up flow run infrastructure if we don't have a concurrency slot. So this proposal uses hooks into the Pending state to acquire a concurrency slot.

Example Use

No response

Additional context

No response
