I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
I have searched existing issues and could not find a match for this bug
I have a WorkflowTemplate that defines sequential workflows using a DAG.
The DAG is responsible solely for managing the sequence of workflows, and each step references a workflow using templateRef.
Some of these steps (workflows) implement their own mutex synchronization mechanisms, but these are not applied to the entire WorkflowTemplate. In other words, certain steps need synchronization, while others do not.
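To make the setup concrete, here is a minimal sketch of the layout described above. It is illustrative only: every name (`pipeline`, `locked-step`, `plain-step`, `step-a-lock`), the image, and the command are assumptions, and these are not the actual manifests from this report. It assumes the template-level `synchronization.mutex` form available in v3.5.

```yaml
# Hypothetical names throughout; not the reporter's actual manifests.
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: pipeline                 # assumed name
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          # The DAG only sequences the steps; each one is a templateRef.
          - name: step-a
            templateRef:
              name: locked-step
              template: run
          - name: step-b
            dependencies: [step-a]
            templateRef:
              name: plain-step   # assumed template without synchronization (not shown)
              template: run
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: locked-step              # assumed name
spec:
  templates:
    - name: run
      # Mutex declared on this template only, not on the whole WorkflowTemplate.
      synchronization:
        mutex:
          name: step-a-lock      # assumed lock name
      container:
        image: busybox
        command: [sh, -c, "sleep 10"]
```

With a layout like this, the expectation described in this report is that `step-a-lock` becomes available as soon as `step-a` finishes, so a second workflow submitted from `pipeline` in parallel could start its own `step-a` while the first one is still running `step-b`.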
The issue arises after a workflow with its own mutex completes execution: the lock does not appear to be released, even though its pod has finished.
At this point, another parallel workflow becomes stuck with the message:
Waiting for … lock. Lock status: 0/1
From my understanding, once the pod with the mutex has completed execution, the mutex should be released, allowing the next workflow to acquire the lock.
However, it appears that mutexes for all workflows within the template are only released after the entire WorkflowTemplate has completed execution.
Am I misunderstanding how the mutex synchronization works in this context?
Or is there a configuration or behavior I may have overlooked that ensures the mutex is released immediately after the specific workflow (or pod) finishes?
I am filing this issue against version 3.5.5 because there have been no updates regarding this feature.
Version(s)
v3.5.5
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
The following is the WorkflowTemplate
And the following is the referred workflow.
Logs from the workflow controller
Logs from your workflow's wait container