Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argo UI incorrectly shows DAG exit-handler step in Pending state when Workflow manually stopped and exit-handler completes #13964

Open
3 of 4 tasks
encigem opened this issue Dec 3, 2024 · 2 comments

Comments

@encigem
Copy link

encigem commented Dec 3, 2024

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

The UI should reflect the true state of all Workflow steps/tasks/outputs/etc., but this is not the case with regards to DAG exit-handlers as they successfully complete execution, but still are displayed as 'Pending' on the UI:
test1-UI
test2-UI

Version(s)

v3.5.10, v3.6.0-rc4, v3.6.2

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflows that uses private images.

Example1 (using suspend step, issue triggered at stoppage any time during suspend step):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-stop-bug-example-
spec:
  entrypoint: main
  onExit: exit-handler       # Issue occurs (using a DAG)
  # onExit: cleanup-step     # Issue does not occur (not using a DAG)
  arguments:
    parameters:
      - name: image
        value: "alpine:latest"
      - name: duration
        value: 60
  templates:
    - name: main
      dag:
        tasks:
          - name: suspend-dag      # Manually stop this step to reproduce the bug, when using a DAG exit-handler
            template: suspend-step
    - name: exit-handler
      dag:
        tasks:
          - name: cleanup-dag
            template: cleanup-step

    - name: suspend-step
      suspend:
        duration: '{{workflow.parameters.duration}}'
    - name: cleanup-step
      container:
        image: '{{workflow.parameters.image}}'
        command: ['echo']
        args: ['Executed on-exit cleanup step']

Example2 (issue occurs when stoppage is within ~5seconds of step completion):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-suspend-bug-
spec:
  entrypoint: main
  onExit: cleanup-dag
  arguments:
    parameters:
      - name: image
        value: "alpine:latest"
      - name: duration
        value: 20

  templates:
    - name: main
      steps:
      - - name: sleep1 # Manually stop this step ~5 seconds before completion to reproduce the bug
          template: sleep
    - name: cleanup-dag
      dag:
        tasks:
        - name: cleanup
          template: cleanup

    - name: sleep
      container:
        image: '{{workflow.parameters.image}}'
        command: ['sh', '-c']
        args:
        - |
          i={{workflow.parameters.duration}}
          while [ $i -gt 0 ]; do
            echo "$i seconds left before exiting"
            sleep 1
            i=$(($i-1))
          done
          echo 'Done!'
    - name: cleanup
      container:
        image: '{{workflow.parameters.image}}'
        command: ['echo']
        args: ['Executed on-exit cleanup step']

Logs from the workflow controller

kubectl logs -n argo deploy/argo-workflow-controller

time="2024-12-03T13:21:36Z" level=info msg="index config" indexWorkflowSemaphoreKeys=true
time="2024-12-03T13:21:36Z" level=info msg="cron config" cronSyncPeriod=10s
time="2024-12-03T13:21:36Z" level=info msg="Memoization caches will be garbage-collected if they have not been hit after" gcAfterNotHitDuration=30s
time="2024-12-03T13:21:36.590Z" level=info msg="not enabling pprof debug endpoints"
time="2024-12-03T13:21:36.612Z" level=info msg="Configuration:\nartifactRepository:\n  archiveLogs: true\n  s3:\n    accessKeySecret:\n      key: s3_access_key\n      name: cluster-config-s3-auth\n    bucket: projectdev-moore030\n    endpoint: customis3obj.org.company.se:10444\n    insecure: false\n    keyFormat: artifacts/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{workflow.name}}/{{pod.name}}\n    secretKeySecret:\n      key: s3_secret_key\n      name: cluster-config-s3-auth\ninitialDelay: 0s\nmetricsConfig:\n  enabled: true\n  path: /metrics\n  port: 9090\n  secure: false\nnodeEvents:\n  enabled: true\npodSpecLogStrategy: {}\nsso:\n  clientId:\n    key: client-id-key\n    name: client-id-secret\n  clientSecret:\n    key: client-secret-key\n    name: client-secret-secret\n  issuer: https://iam.npee.gic.company.se/auth/realms/project-production\n  rbac:\n    enabled: true\n  redirectUrl: https://argo-ui.moore030.rnd.gic.company.se/oauth2/callback\n  scopes:\n  - groups\n  - email\n  sessionExpiry: 0s\ntelemetryConfig: {}\nworkflowDefaults:\n  metadata:\n    creationTimestamp: null\n  spec:\n    arguments: {}\n    podGC:\n      deleteDelayDuration: 1h\n      strategy: OnWorkflowCompletion\n    ttlStrategy:\n      secondsAfterCompletion: 259200\n      secondsAfterFailure: 604800\n      secondsAfterSuccess: 864000\n  status:\n    finishedAt: null\n    startedAt: null\nworkflowEvents: {}\n"
time="2024-12-03T13:21:36.612Z" level=info msg="Persistence configuration disabled"
time="2024-12-03T13:21:36.612Z" level=info executorImage="customdocker.org.company.se/proj-project/dev/encigem/argoproj/argo-workflows/argoexec-egad:5df504dfae97baa3dcb877aed74b300c4dc2d5534b6704ccbdb6bb4b49835d2f" executorImagePullPolicy= managedNamespace=
time="2024-12-03T13:21:36.613Z" level=info msg="Starting Prometheus metrics exporter"
time="2024-12-03T13:21:36.614Z" level=info msg="Leader election is turned off. Running in single-instance mode"
time="2024-12-03T13:21:36.614Z" level=info msg="starting leading" id=single-instance
time="2024-12-03T13:21:36.614Z" level=info msg="DB migration is disabled"
time="2024-12-03T13:21:36.614Z" level=info msg="Starting Workflow Controller" defaultRequeueTime=10s version=v3.6.2
time="2024-12-03T13:21:36.614Z" level=info msg="Current Worker Numbers" cronWorkflowWorkers=8 podCleanup=4 workflowArchive=8 workflowTtlWorkers=4 workflowWorkers=32
time="2024-12-03T13:21:36.614Z" level=info msg="Starting prometheus metrics server at localhost:9090/metrics"
time="2024-12-03T13:21:36.614Z" level=info msg="Watching task results" labelSelector="!workflows.argoproj.io/controller-instanceid,workflows.argoproj.io/workflow"
time="2024-12-03T13:21:36.614Z" level=info msg=Plugins executorPlugins=false
time="2024-12-03T13:21:36.622Z" level=info msg="Manager initialized successfully"
time="2024-12-03T13:21:36.729Z" level=info msg="Received Workflow Controller config map argo/argo-workflow-controller-configmap update"
time="2024-12-03T13:21:36.740Z" level=info msg="Configuration:\nartifactRepository:\n  archiveLogs: true\n  s3:\n    accessKeySecret:\n      key: s3_access_key\n      name: cluster-config-s3-auth\n    bucket: projectdev-moore030\n    endpoint: customis3obj.org.company.se:10444\n    insecure: false\n    keyFormat: artifacts/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{workflow.name}}/{{pod.name}}\n    secretKeySecret:\n      key: s3_secret_key\n      name: cluster-config-s3-auth\ninitialDelay: 0s\nmetricsConfig:\n  enabled: true\n  path: /metrics\n  port: 9090\n  secure: false\nnodeEvents:\n  enabled: true\npodSpecLogStrategy: {}\nsso:\n  clientId:\n    key: client-id-key\n    name: client-id-secret\n  clientSecret:\n    key: client-secret-key\n    name: client-secret-secret\n  issuer: https://iam.npee.gic.company.se/auth/realms/project-production\n  rbac:\n    enabled: true\n  redirectUrl: https://argo-ui.moore030.rnd.gic.company.se/oauth2/callback\n  scopes:\n  - groups\n  - email\n  sessionExpiry: 0s\ntelemetryConfig: {}\nworkflowDefaults:\n  metadata:\n    creationTimestamp: null\n  spec:\n    arguments: {}\n    podGC:\n      deleteDelayDuration: 1h\n      strategy: OnWorkflowCompletion\n    ttlStrategy:\n      secondsAfterCompletion: 259200\n      secondsAfterFailure: 604800\n      secondsAfterSuccess: 864000\n  status:\n    finishedAt: null\n    startedAt: null\nworkflowEvents: {}\n"
time="2024-12-03T13:21:36.740Z" level=info msg="Persistence configuration disabled"
time="2024-12-03T13:21:36.740Z" level=info executorImage="customdocker.org.company.se/proj-project/dev/encigem/argoproj/argo-workflows/argoexec-egad:5df504dfae97baa3dcb877aed74b300c4dc2d5534b6704ccbdb6bb4b49835d2f" executorImagePullPolicy= managedNamespace=
time="2024-12-03T13:21:36.999Z" level=info msg="Starting CronWorkflow controller"
time="2024-12-03T13:21:36.999Z" level=info msg="Performing periodic GC" periodicity=5m0s
time="2024-12-03T13:21:37.000Z" level=info msg="Persistence disabled - so archived workflow GC disabled - you must restart the controller if you enable this"
time="2024-12-03T13:21:37.000Z" level=info msg="Starting workflow garbage collector controller (retentionWorkers 4)"
time="2024-12-03T13:21:37.005Z" level=info msg="Started workflow garbage collection"
W1203 13:21:37.006058       1 shared_informer.go:463] The sharedIndexInformer has started, run more than once is not allowed
time="2024-12-03T13:23:17.414Z" level=info msg="Processing workflow" Phase= ResourceVersion=737689224 namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=info msg="resolved artifact repository" artifactRepositoryRef=default-artifact-repository
time="2024-12-03T13:23:17.422Z" level=info msg="Task-result reconciliation" namespace=project-il numObjs=0 workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=info msg="Updated phase  -> Running" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=info msg="DAG node test-stop-bug-example-5m944 initialized Running" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=warning msg="was unable to obtain the node for test-stop-bug-example-5m944-4040385659, taskName suspend-dag"
time="2024-12-03T13:23:17.422Z" level=warning msg="was unable to obtain the node for test-stop-bug-example-5m944-4040385659, taskName suspend-dag"
time="2024-12-03T13:23:17.422Z" level=info msg="All of node test-stop-bug-example-5m944.suspend-dag dependencies [] completed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.422Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.423Z" level=info msg="Suspend node test-stop-bug-example-5m944-4040385659 initialized Pending" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.423Z" level=info msg="node test-stop-bug-example-5m944.suspend-dag suspended" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.423Z" level=info msg="node test-stop-bug-example-5m944-4040385659 phase Pending -> Running" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.423Z" level=info msg="TaskSet Reconciliation" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.423Z" level=info msg=reconcileAgentPod namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:17.423Z" level=info msg="Workflow to be dehydrated" Workflow Size=1796
time="2024-12-03T13:23:17.444Z" level=info msg="Workflow update successful" namespace=project-il phase=Running resourceVersion=737689226 workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.447Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=737689275 namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.448Z" level=info msg="Task-result reconciliation" namespace=project-il numObjs=0 workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.448Z" level=info msg="node test-stop-bug-example-5m944-4040385659 phase Running -> Failed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.448Z" level=info msg="node test-stop-bug-example-5m944-4040385659 message: Stopped with strategy 'Stop'" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.448Z" level=info msg="node test-stop-bug-example-5m944-4040385659 finished: 2024-12-03 13:23:27.448844435 +0000 UTC" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="Outbound nodes of test-stop-bug-example-5m944 set to [test-stop-bug-example-5m944-4040385659]" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="node test-stop-bug-example-5m944 phase Running -> Failed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="node test-stop-bug-example-5m944 finished: 2024-12-03 13:23:27.449076738 +0000 UTC" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="Running OnExit handler: exit-handler" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="DAG node test-stop-bug-example-5m944-924555479 initialized Running" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=warning msg="was unable to obtain the node for test-stop-bug-example-5m944-2119134400, taskName cleanup-dag"
time="2024-12-03T13:23:27.449Z" level=warning msg="was unable to obtain the node for test-stop-bug-example-5m944-2119134400, taskName cleanup-dag"
time="2024-12-03T13:23:27.449Z" level=info msg="All of node test-stop-bug-example-5m944.onExit.cleanup-dag dependencies [] completed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.449Z" level=info msg="Pod node test-stop-bug-example-5m944-2119134400 initialized Pending" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="Created pod: test-stop-bug-example-5m944.onExit.cleanup-dag (test-stop-bug-example-5m944-cleanup-step-2119134400)" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="Outbound nodes of test-stop-bug-example-5m944-924555479 set to [test-stop-bug-example-5m944-2119134400]" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="node test-stop-bug-example-5m944-924555479 phase Running -> Failed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="node test-stop-bug-example-5m944-924555479 finished: 2024-12-03 13:23:27.477800737 +0000 UTC" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="Updated phase Running -> Failed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="Updated message  -> Stopped with strategy 'Stop'" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="Marking workflow completed" namespace=project-il workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.477Z" level=info msg="Workflow to be dehydrated" Workflow Size=2847
time="2024-12-03T13:23:27.504Z" level=info msg="Workflow update successful" namespace=project-il phase=Failed resourceVersion=737689292 workflow=test-stop-bug-example-5m944
time="2024-12-03T13:23:27.505Z" level=info msg="Queueing Failed workflow project-il/test-stop-bug-example-5m944 for delete in 168h0m0s due to TTL"
time="2024-12-03T13:23:27.565Z" level=info msg="cleaning up pod" action=removeFinalizer key=project-il/test-stop-bug-example-5m944-cleanup-step-2119134400/removeFinalizer
time="2024-12-03T13:23:27.598Z" level=info msg="insignificant pod change" key=project-il/test-stop-bug-example-5m944-cleanup-step-2119134400
time="2024-12-03T13:24:36.249Z" level=info msg="Processing workflow" Phase= ResourceVersion=737689710 namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.256Z" level=info msg="resolved artifact repository" artifactRepositoryRef=default-artifact-repository
time="2024-12-03T13:24:36.256Z" level=info msg="Task-result reconciliation" namespace=project-il numObjs=0 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.256Z" level=info msg="Updated phase  -> Running" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.257Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.257Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.257Z" level=info msg="Steps node test-stop-bug-example2-gkbbp initialized Running" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.258Z" level=info msg="StepGroup node test-stop-bug-example2-gkbbp-1000708489 initialized Running" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.259Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.259Z" level=info msg="Pod node test-stop-bug-example2-gkbbp-2154453211 initialized Pending" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.291Z" level=info msg="Created pod: test-stop-bug-example2-gkbbp[0].sleep1 (test-stop-bug-example2-gkbbp-sleep-2154453211)" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.291Z" level=info msg="Workflow step group node test-stop-bug-example2-gkbbp-1000708489 not yet completed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.291Z" level=info msg="TaskSet Reconciliation" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.291Z" level=info msg=reconcileAgentPod namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:36.292Z" level=info msg="Workflow to be dehydrated" Workflow Size=2217
time="2024-12-03T13:24:36.315Z" level=info msg="Workflow update successful" namespace=project-il phase=Running resourceVersion=737689714 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.296Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=737689714 namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.297Z" level=info msg="Task-result reconciliation" namespace=project-il numObjs=1 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.298Z" level=info msg="node changed" namespace=project-il new.message= new.phase=Running new.progress=0/1 nodeID=test-stop-bug-example2-gkbbp-2154453211 old.message= old.phase=Pending old.progress=0/1 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.298Z" level=info msg="Workflow step group node test-stop-bug-example2-gkbbp-1000708489 not yet completed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.298Z" level=info msg="TaskSet Reconciliation" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.298Z" level=info msg=reconcileAgentPod namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:46.298Z" level=info msg="Workflow to be dehydrated" Workflow Size=2565
time="2024-12-03T13:24:46.325Z" level=info msg="Workflow update successful" namespace=project-il phase=Running resourceVersion=737689793 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:56.328Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=737689793 namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:56.330Z" level=info msg="Task-result reconciliation" namespace=project-il numObjs=1 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:56.330Z" level=info msg="node unchanged" namespace=project-il nodeID=test-stop-bug-example2-gkbbp-2154453211 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:56.330Z" level=info msg="Workflow step group node test-stop-bug-example2-gkbbp-1000708489 not yet completed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:56.330Z" level=info msg="TaskSet Reconciliation" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:56.330Z" level=info msg=reconcileAgentPod namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.592Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=737689882 namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.593Z" level=info msg="Task-result reconciliation" namespace=project-il numObjs=1 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.593Z" level=info msg="task-result changed" namespace=project-il nodeID=test-stop-bug-example2-gkbbp-2154453211 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="node changed" namespace=project-il new.message= new.phase=Succeeded new.progress=0/1 nodeID=test-stop-bug-example2-gkbbp-2154453211 old.message= old.phase=Running old.progress=0/1 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="Step group node test-stop-bug-example2-gkbbp-1000708489 successful" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="node test-stop-bug-example2-gkbbp-1000708489 phase Running -> Succeeded" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="node test-stop-bug-example2-gkbbp-1000708489 finished: 2024-12-03 13:25:07.59465786 +0000 UTC" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="Outbound nodes of test-stop-bug-example2-gkbbp-2154453211 is [test-stop-bug-example2-gkbbp-2154453211]" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="Outbound nodes of test-stop-bug-example2-gkbbp is [test-stop-bug-example2-gkbbp-2154453211]" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="node test-stop-bug-example2-gkbbp phase Running -> Succeeded" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="node test-stop-bug-example2-gkbbp finished: 2024-12-03 13:25:07.594773382 +0000 UTC" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="Running OnExit handler: cleanup-dag" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.594Z" level=info msg="DAG node test-stop-bug-example2-gkbbp-759617756 initialized Running" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.595Z" level=warning msg="was unable to obtain the node for test-stop-bug-example2-gkbbp-2095643750, taskName cleanup"
time="2024-12-03T13:25:07.595Z" level=warning msg="was unable to obtain the node for test-stop-bug-example2-gkbbp-2095643750, taskName cleanup"
time="2024-12-03T13:25:07.595Z" level=info msg="All of node test-stop-bug-example2-gkbbp.onExit.cleanup dependencies [] completed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.595Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.595Z" level=info msg="Pod node test-stop-bug-example2-gkbbp-2095643750 initialized Pending" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="Created pod: test-stop-bug-example2-gkbbp.onExit.cleanup (test-stop-bug-example2-gkbbp-cleanup-2095643750)" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="Outbound nodes of test-stop-bug-example2-gkbbp-759617756 set to [test-stop-bug-example2-gkbbp-2095643750]" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="node test-stop-bug-example2-gkbbp-759617756 phase Running -> Failed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="node test-stop-bug-example2-gkbbp-759617756 finished: 2024-12-03 13:25:07.625236423 +0000 UTC" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="Updated phase Running -> Failed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="Marking workflow completed" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.625Z" level=info msg="Workflow to be dehydrated" Workflow Size=3538
time="2024-12-03T13:25:07.646Z" level=info msg="Workflow update successful" namespace=project-il phase=Failed resourceVersion=737689953 workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:25:07.648Z" level=info msg="Queueing Failed workflow project-il/test-stop-bug-example2-gkbbp for delete in 168h0m0s due to TTL"
time="2024-12-03T13:25:07.676Z" level=info msg="cleaning up pod" action=removeFinalizer key=project-il/test-stop-bug-example2-gkbbp-cleanup-2095643750/removeFinalizer
time="2024-12-03T13:25:07.676Z" level=info msg="cleaning up pod" action=removeFinalizer key=project-il/test-stop-bug-example2-gkbbp-sleep-2154453211/removeFinalizer
time="2024-12-03T13:25:07.708Z" level=info msg="insignificant pod change" key=project-il/test-stop-bug-example2-gkbbp-sleep-2154453211
time="2024-12-03T13:25:07.713Z" level=info msg="insignificant pod change" key=project-il/test-stop-bug-example2-gkbbp-cleanup-2095643750
time="2024-12-03T13:26:36.591Z" level=info msg="Alloc=16069 TotalAlloc=61685 Sys=35157 NumGC=11 Goroutines=181"
time="2024-12-03T13:31:36.590Z" level=info msg="Alloc=13637 TotalAlloc=76166 Sys=35157 NumGC=14 Goroutines=181"

kubectl logs -n argo deploy/argo-server

time="2024-12-03T13:21:36.636Z" level=info msg="not enabling pprof debug endpoints"
time="2024-12-03T13:21:36.637Z" level=info authModes="[sso client]" baseHRef=/ managedNamespace= namespace=argo secure=true ssoNamespace=argo
time="2024-12-03T13:21:36.637Z" level=info msg="Generating Self Signed TLS Certificates for Secure Mode"
time="2024-12-03T13:21:36.912Z" level=info msg="SSO configuration" clientId="{{client-id-secret} client-id-key <nil>}" filterGroupsRegex="[]" insecureSkipVerify=false issuer="https://iam.npee.gic.company.se/auth/realms/project-production" issuerAlias=DISABLED redirectUrl="https://argo-ui.moore030.rnd.gic.company.se/oauth2/callback" scopes="[groups email openid]"
time="2024-12-03T13:21:37.013Z" level=info msg="SSO enabled"
time="2024-12-03T13:21:37.019Z" level=info msg="Starting Argo Server" instanceID= version=v3.6.2
time="2024-12-03T13:21:37.019Z" level=info msg="Creating event controller" asyncDispatch=false operationQueueSize=16 workerCount=4
time="2024-12-03T13:21:37.030Z" level=info msg="GRPC Server Max Message Size, MaxGRPCMessageSize, is set" GRPC_MESSAGE_SIZE=209715200
time="2024-12-03T13:21:37.030Z" level=info msg="Argo Server started successfully on https://localhost:2746" url="https://localhost:2746"
time="2024-12-03T13:21:55.932Z" level=info duration="138.503µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:22:00.526Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:22:15.933Z" level=info duration="93.036µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:22:35.933Z" level=info duration="263.305µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:22:55.932Z" level=info duration="133.501µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:23:15.937Z" level=info duration="124.493µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:23:22.843Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:23:22.843Z" level=info duration=1m22.321166776s method=GET path=/api/v1/workflow-events/ size=825 status=0
time="2024-12-03T13:23:22.843Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:22:00Z" grpc.time_ms=82320.14 span.kind=server system=grpc
time="2024-12-03T13:23:22.873Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:23:22.884Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:23:22Z" grpc.time_ms=14.6 span.kind=server system=grpc
time="2024-12-03T13:23:22.887Z" level=info duration=18.353594ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example-5m944 size=3375 status=0
time="2024-12-03T13:23:22.908Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:23:25.970Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:23:26.000Z" level=warning msg="Non-transient error: <nil>"
time="2024-12-03T13:23:26.009Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=StopWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:23:25Z" grpc.time_ms=41.651 span.kind=server system=grpc
time="2024-12-03T13:23:26.011Z" level=info duration=43.829168ms method=PUT path=/api/v1/workflows/project-il/test-stop-bug-example-5m944/stop size=3393 status=0
time="2024-12-03T13:23:35.932Z" level=info duration="132.778µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:23:43.554Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:23:43.561Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:23:43Z" grpc.time_ms=9.64 span.kind=server system=grpc
time="2024-12-03T13:23:43.561Z" level=info duration=10.978705ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example-5m944 size=4605 status=0
time="2024-12-03T13:23:43.576Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:23:43.647Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:23:43.647Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WorkflowLogs grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:23:43Z" grpc.time_ms=73.105 span.kind=server system=grpc
time="2024-12-03T13:23:43.647Z" level=info duration=74.020041ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example-5m944/log size=510 status=0
time="2024-12-03T13:23:43.699Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:23:43.706Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:23:43Z" grpc.time_ms=9.227 span.kind=server system=grpc
time="2024-12-03T13:23:43.707Z" level=info duration=10.419735ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example-5m944 size=4605 status=0
time="2024-12-03T13:23:55.932Z" level=info duration="153.653µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:24:15.933Z" level=info duration="149.986µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:24:35.933Z" level=info duration="267.713µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:24:40.145Z" level=info duration=1m17.240599636s method=GET path=/api/v1/workflow-events/project-il size=11641 status=0
time="2024-12-03T13:24:40.146Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:24:40.146Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:23:22Z" grpc.time_ms=77240.48 span.kind=server system=grpc
time="2024-12-03T13:24:40.184Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:40.192Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:40.192Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:40.205Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListCronWorkflows grpc.service=cronworkflow.CronWorkflowService grpc.start_time="2024-12-03T13:24:40Z" grpc.time_ms=15.951 span.kind=server system=grpc
time="2024-12-03T13:24:40.208Z" level=info duration=19.127247ms method=GET path=/api/v1/cron-workflows/project-il size=57 status=0
time="2024-12-03T13:24:40.214Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:40Z" grpc.time_ms=23.98 span.kind=server system=grpc
time="2024-12-03T13:24:40.215Z" level=info duration=25.895091ms method=GET path=/api/v1/workflows/project-il size=995 status=0
time="2024-12-03T13:24:40.229Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:40.252Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflowTemplates grpc.service=workflowtemplate.WorkflowTemplateService grpc.start_time="2024-12-03T13:24:40Z" grpc.time_ms=72.186 span.kind=server system=grpc
time="2024-12-03T13:24:40.262Z" level=info duration=83.325254ms method=GET path=/api/v1/workflow-templates/project-il size=339087 status=0
time="2024-12-03T13:24:41.661Z" level=info duration=1.435896789s method=GET path=/api/v1/workflow-events/project-il size=11 status=408
time="2024-12-03T13:24:41.662Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:24:41.662Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:40Z" grpc.time_ms=1435.938 span.kind=server system=grpc
time="2024-12-03T13:24:41.684Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:41.691Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:41Z" grpc.time_ms=9.717 span.kind=server system=grpc
time="2024-12-03T13:24:41.692Z" level=info duration=10.955746ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example2-gkbbp size=3960 status=0
time="2024-12-03T13:24:41.729Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:46.040Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:46.048Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:46Z" grpc.time_ms=10.696 span.kind=server system=grpc
time="2024-12-03T13:24:46.049Z" level=info duration=12.157411ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example2-gkbbp size=3960 status=0
time="2024-12-03T13:24:46.064Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:54.754Z" level=info duration=8.695649503s method=GET path=/api/v1/workflows/project-il/test-stop-bug-example2-gkbbp/log size=2244 status=0
time="2024-12-03T13:24:54.754Z" level=error msg="rpc error: code = Canceled desc = context canceled" namespace=project-il workflow=test-stop-bug-example2-gkbbp
time="2024-12-03T13:24:54.755Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:24:54.755Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WorkflowLogs grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:46Z" grpc.time_ms=8696.052 span.kind=server system=grpc
time="2024-12-03T13:24:55.933Z" level=info duration="139.718µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:24:57.549Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:24:57.590Z" level=warning msg="Non-transient error: <nil>"
time="2024-12-03T13:24:57.595Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=StopWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:57Z" grpc.time_ms=48.298 span.kind=server system=grpc
time="2024-12-03T13:24:57.595Z" level=info duration=49.449371ms method=PUT path=/api/v1/workflows/project-il/test-stop-bug-example2-gkbbp/stop size=4146 status=0
time="2024-12-03T13:25:15.933Z" level=info duration="171.75µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:25:35.932Z" level=info duration="153.399µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:25:37.998Z" level=info duration=56.286290134s method=GET path=/api/v1/workflow-events/project-il size=18145 status=0
time="2024-12-03T13:25:37.998Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:25:37.998Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:24:41Z" grpc.time_ms=56281.19 span.kind=server system=grpc
time="2024-12-03T13:25:38.024Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:25:38.034Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:25:38.036Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:25:38.039Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListCronWorkflows grpc.service=cronworkflow.CronWorkflowService grpc.start_time="2024-12-03T13:25:38Z" grpc.time_ms=7.328 span.kind=server system=grpc
time="2024-12-03T13:25:38.039Z" level=info duration=8.080702ms method=GET path=/api/v1/cron-workflows/project-il size=57 status=0
time="2024-12-03T13:25:38.047Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:25:38Z" grpc.time_ms=13.567 span.kind=server system=grpc
time="2024-12-03T13:25:38.047Z" level=info duration=14.449921ms method=GET path=/api/v1/workflows/project-il size=1010 status=0
time="2024-12-03T13:25:38.061Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:25:38.080Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflowTemplates grpc.service=workflowtemplate.WorkflowTemplateService grpc.start_time="2024-12-03T13:25:38Z" grpc.time_ms=58.587 span.kind=server system=grpc
time="2024-12-03T13:25:38.085Z" level=info duration=64.627237ms method=GET path=/api/v1/workflow-templates/project-il size=339087 status=0
time="2024-12-03T13:25:39.221Z" level=info duration=1.162388114s method=GET path=/api/v1/workflow-events/project-il size=11 status=408
time="2024-12-03T13:25:39.221Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:25:39.221Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:25:38Z" grpc.time_ms=1162.049 span.kind=server system=grpc
time="2024-12-03T13:25:39.242Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:25:39.248Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:25:39Z" grpc.time_ms=8.357 span.kind=server system=grpc
time="2024-12-03T13:25:39.249Z" level=info duration=9.335616ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example-5m944 size=4605 status=0
time="2024-12-03T13:25:39.266Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:25:55.935Z" level=info duration="213.729µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:26:04.880Z" level=info duration=25.616393103s method=GET path=/api/v1/workflow-events/project-il size=4765 status=0
time="2024-12-03T13:26:04.880Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:26:04.880Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:25:39Z" grpc.time_ms=25616.283 span.kind=server system=grpc
time="2024-12-03T13:26:04.901Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:26:04.901Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:26:04.903Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:26:04.913Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListCronWorkflows grpc.service=cronworkflow.CronWorkflowService grpc.start_time="2024-12-03T13:26:04Z" grpc.time_ms=14.511 span.kind=server system=grpc
time="2024-12-03T13:26:04.913Z" level=info duration=15.260295ms method=GET path=/api/v1/cron-workflows/project-il size=57 status=0
time="2024-12-03T13:26:04.922Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:26:04Z" grpc.time_ms=21.06 span.kind=server system=grpc
time="2024-12-03T13:26:04.922Z" level=info duration=22.034979ms method=GET path=/api/v1/workflows/project-il size=1010 status=0
time="2024-12-03T13:26:04.934Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:26:04.972Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflowTemplates grpc.service=workflowtemplate.WorkflowTemplateService grpc.start_time="2024-12-03T13:26:04Z" grpc.time_ms=76.554 span.kind=server system=grpc
time="2024-12-03T13:26:04.978Z" level=info duration=82.952462ms method=GET path=/api/v1/workflow-templates/project-il size=339087 status=0
time="2024-12-03T13:26:06.436Z" level=info duration=1.504317031s method=GET path=/api/v1/workflow-events/project-il size=11 status=408
time="2024-12-03T13:26:06.436Z" level=warning msg="Failed to set header 'argo-version': rpc error: code = Internal desc = transport: SendHeader called multiple times"
time="2024-12-03T13:26:06.436Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:26:04Z" grpc.time_ms=1504.166 span.kind=server system=grpc
time="2024-12-03T13:26:06.456Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:26:06.463Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2024-12-03T13:26:06Z" grpc.time_ms=9.122 span.kind=server system=grpc
time="2024-12-03T13:26:06.464Z" level=info duration=10.429331ms method=GET path=/api/v1/workflows/project-il/test-stop-bug-example2-gkbbp size=5597 status=0
time="2024-12-03T13:26:06.480Z" level=info msg="selected SSO RBAC service account for user" [email protected] loginServiceAccount=daas-argo-sso-users serviceAccount=daas-argo-sso-users ssoDelegated=false ssoDelegationAllowed=false subject=55fabed9-e302-452b-b05d-aafc02e02135
time="2024-12-03T13:26:15.932Z" level=info duration="155.608µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:26:35.932Z" level=info duration="159.12µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:26:36.637Z" level=info msg="Alloc=22234 TotalAlloc=58554 Sys=43093 NumGC=7 Goroutines=115"
time="2024-12-03T13:26:55.932Z" level=info duration="144.16µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:27:15.932Z" level=info duration="129.443µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:27:35.932Z" level=info duration="138.4µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:27:55.934Z" level=info duration="357.488µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:28:15.933Z" level=info duration="147.079µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:28:35.932Z" level=info duration="118.614µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:28:55.932Z" level=info duration="157.489µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:29:15.933Z" level=info duration="212.551µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:29:35.933Z" level=info duration="217.762µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:29:55.932Z" level=info duration="136.095µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:30:15.933Z" level=info duration="246.706µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:30:35.932Z" level=info duration="148.719µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:30:55.933Z" level=info duration="140.396µs" method=GET path=index.html size=487 status=0
time="2024-12-03T13:31:15.934Z" level=info duration="132.806µs" method=GET path=index.html size=487 status=0

Logs from in your workflow's wait container

kubectl logs -n project-il -c wait -l workflows.argoproj.io/workflow=${workflow} [example1]:

time="2024-12-03T13:23:29.857Z" level=info msg="Starting Workflow Executor" version=v3.6.2
time="2024-12-03T13:23:29.861Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2024-12-03T13:23:29.861Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=project-il podName=test-stop-bug-example-5m944-cleanup-step-2119134400 templateName=cleanup-step version="&Version{Version:v3.6.2,BuildDate:2024-12-02T14:13:35Z,GitCommit:741ab0ef7b6432925e49882cb4294adccf5912ec,GitTag:v3.6.2,GitTreeState:clean,GoVersion:go1.23.3,Compiler:gc,Platform:linux/amd64,}"
time="2024-12-03T13:23:29.885Z" level=info msg="Starting deadline monitor"
time="2024-12-03T13:23:31.886Z" level=info msg="Main container completed" error="<nil>"
time="2024-12-03T13:23:31.886Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2024-12-03T13:23:31.886Z" level=info msg="No output parameters"
time="2024-12-03T13:23:31.886Z" level=info msg="No output artifacts"
time="2024-12-03T13:23:31.887Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: artifacts/2024/12/03/test-stop-bug-example-5m944/test-stop-bug-example-5m944-cleanup-step-2119134400/main.log"
time="2024-12-03T13:23:31.887Z" level=info msg="Creating minio client using static credentials" endpoint="customis3obj.custom.gic.company.se:10444"
time="2024-12-03T13:23:31.888Z" level=info msg="Saving file to s3" bucket=projectdev-moore030 endpoint="customis3obj.custom.gic.company.se:10444" key=artifacts/2024/12/03/test-stop-bug-example-5m944/test-stop-bug-example-5m944-cleanup-step-2119134400/main.log path=/tmp/argo/outputs/logs/main.log
time="2024-12-03T13:23:31.972Z" level=info msg="Save artifact" artifactName=main-logs duration=85.453726ms error="<nil>" key=artifacts/2024/12/03/test-stop-bug-example-5m944/test-stop-bug-example-5m944-cleanup-step-2119134400/main.log
time="2024-12-03T13:23:31.972Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2024-12-03T13:23:31.972Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2024-12-03T13:23:32.011Z" level=info msg="Alloc=17648 TotalAlloc=22815 Sys=31061 NumGC=3 Goroutines=10"

kubectl logs -n project-il -c wait -l workflows.argoproj.io/workflow=${workflow} [example2]:

time="2024-12-03T13:24:38.130Z" level=info msg="Starting Workflow Executor" version=v3.6.2
time="2024-12-03T13:24:38.134Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2024-12-03T13:24:38.134Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=project-il podName=test-stop-bug-example2-gkbbp-sleep-2154453211 templateName=sleep version="&Version{Version:v3.6.2,BuildDate:2024-12-02T14:13:35Z,GitCommit:741ab0ef7b6432925e49882cb4294adccf5912ec,GitTag:v3.6.2,GitTreeState:clean,GoVersion:go1.23.3,Compiler:gc,Platform:linux/amd64,}"
time="2024-12-03T13:24:38.164Z" level=info msg="Starting deadline monitor"
time="2024-12-03T13:25:00.183Z" level=info msg="Main container completed" error="<nil>"
time="2024-12-03T13:25:00.183Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2024-12-03T13:25:00.183Z" level=info msg="No output parameters"
time="2024-12-03T13:25:00.183Z" level=info msg="No output artifacts"
time="2024-12-03T13:25:00.184Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: artifacts/2024/12/03/test-stop-bug-example2-gkbbp/test-stop-bug-example2-gkbbp-sleep-2154453211/main.log"
time="2024-12-03T13:25:00.184Z" level=info msg="Creating minio client using static credentials" endpoint="customis3obj.custom.gic.company.se:10444"
time="2024-12-03T13:25:00.185Z" level=info msg="Saving file to s3" bucket=projectdev-moore030 endpoint="customis3obj.custom.gic.company.se:10444" key=artifacts/2024/12/03/test-stop-bug-example2-gkbbp/test-stop-bug-example2-gkbbp-sleep-2154453211/main.log path=/tmp/argo/outputs/logs/main.log
time="2024-12-03T13:25:00.388Z" level=info msg="Save artifact" artifactName=main-logs duration=204.752561ms error="<nil>" key=artifacts/2024/12/03/test-stop-bug-example2-gkbbp/test-stop-bug-example2-gkbbp-sleep-2154453211/main.log
time="2024-12-03T13:25:00.389Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2024-12-03T13:25:00.389Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2024-12-03T13:25:00.429Z" level=info msg="Alloc=17539 TotalAlloc=22846 Sys=30805 NumGC=3 Goroutines=10"

time="2024-12-03T13:25:09.224Z" level=info msg="Starting Workflow Executor" version=v3.6.2
time="2024-12-03T13:25:09.228Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2024-12-03T13:25:09.228Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=project-il podName=test-stop-bug-example2-gkbbp-cleanup-2095643750 templateName=cleanup version="&Version{Version:v3.6.2,BuildDate:2024-12-02T14:13:35Z,GitCommit:741ab0ef7b6432925e49882cb4294adccf5912ec,GitTag:v3.6.2,GitTreeState:clean,GoVersion:go1.23.3,Compiler:gc,Platform:linux/amd64,}"
time="2024-12-03T13:25:09.262Z" level=info msg="Starting deadline monitor"
time="2024-12-03T13:25:11.262Z" level=info msg="Main container completed" error="<nil>"
time="2024-12-03T13:25:11.262Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2024-12-03T13:25:11.262Z" level=info msg="No output parameters"
time="2024-12-03T13:25:11.262Z" level=info msg="No output artifacts"
time="2024-12-03T13:25:11.263Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: artifacts/2024/12/03/test-stop-bug-example2-gkbbp/test-stop-bug-example2-gkbbp-cleanup-2095643750/main.log"
time="2024-12-03T13:25:11.263Z" level=info msg="Creating minio client using static credentials" endpoint="customis3obj.custom.gic.company.se:10444"
time="2024-12-03T13:25:11.264Z" level=info msg="Saving file to s3" bucket=projectdev-moore030 endpoint="customis3obj.custom.gic.company.se:10444" key=artifacts/2024/12/03/test-stop-bug-example2-gkbbp/test-stop-bug-example2-gkbbp-cleanup-2095643750/main.log path=/tmp/argo/outputs/logs/main.log
time="2024-12-03T13:25:11.340Z" level=info msg="Save artifact" artifactName=main-logs duration=76.884621ms error="<nil>" key=artifacts/2024/12/03/test-stop-bug-example2-gkbbp/test-stop-bug-example2-gkbbp-cleanup-2095643750/main.log
time="2024-12-03T13:25:11.340Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2024-12-03T13:25:11.340Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2024-12-03T13:25:11.377Z" level=info msg="Alloc=12693 TotalAlloc=22816 Sys=31061 NumGC=4 Goroutines=10"
@encigem
Copy link
Author

encigem commented Dec 3, 2024

The Workflows are Completed (failed) and the individual steps are also Completed, too:

>>> kubectl get wf -n project-il
NAME                           STATUS   AGE   MESSAGE
test-stop-bug-example-5m944    Failed   23m   Stopped with strategy 'Stop'
test-stop-bug-example2-gkbbp   Failed   22m
>>> kubectl get po -n project-il
NAME                                                  READY   STATUS      RESTARTS   AGE
test-stop-bug-example-5m944-cleanup-step-2119134400   0/2     Completed   0          23m
test-stop-bug-example2-gkbbp-cleanup-2095643750       0/2     Completed   0          22m
test-stop-bug-example2-gkbbp-sleep-2154453211         0/2     Completed   0          22m

@jswxstw jswxstw added area/templates/dag area/exit-handler area/shutdown Shutdown Strategy: Stop and Terminate labels Dec 6, 2024
@jswxstw
Copy link
Member

jswxstw commented Dec 6, 2024

onExitCompleted := true
for _, taskName := range targetTasks {
woc.executeDAGTask(ctx, dagCtx, taskName)
// It is possible that target tasks are not reconsidered (i.e. executeDAGTask is not called on them) once they are
// complete (since the DAG itself will have succeeded). To ensure that their exit handlers are run we also run them here. Note that
// calls to runOnExitNode are idempotent: it is fine if they are called more than once for the same task.
taskNode := dagCtx.getTaskNode(taskName)
if taskNode != nil {
task := dagCtx.GetTask(taskName)
scope, err := woc.buildLocalScopeFromTask(dagCtx, task)
if err != nil {
woc.markNodeError(node.Name, err)
return node, err
}
scope.addParamToScope(fmt.Sprintf("tasks.%s.status", task.Name), string(taskNode.Phase))
_, err = woc.executeTmplLifeCycleHook(ctx, scope, dagCtx.GetTask(taskName).Hooks, taskNode, dagCtx.boundaryID, dagCtx.tmplCtx, "tasks."+taskName)
if err != nil {
woc.markNodeError(node.Name, err)
return node, err
}
if taskNode.Fulfilled() {
if taskNode.Completed() {
hasOnExitNode, onExitNode, err := woc.runOnExitNode(ctx, dagCtx.GetTask(taskName).GetExitHook(woc.execWf.Spec.Arguments), taskNode, dagCtx.boundaryID, dagCtx.tmplCtx, "tasks."+taskName, scope)
if err != nil {
return node, err
}
if hasOnExitNode && (onExitNode == nil || !onExitNode.Fulfilled()) {
onExitCompleted = false
}
}
}
}
}

It seems that there is an issue with the logic for judging onExitCompleted here.
In this case, onExitCompleted will always be true because the targetTask itself is part of exit-handler and its taskNode is not Fulfilled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants