Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix.simulation mode started bug #6351

Merged
merged 1 commit into from
Sep 4, 2024

Conversation

wxtim
Copy link
Member

@wxtim wxtim commented Sep 3, 2024

Fixes a bug discovered by @oliver-sanders in the process of reviewing skip mode work (#6039 (comment)). The bug however, is long-standing (it goes back at least as far as Cylc 8.0.0.

Summary

In simulation mode, job starting is done by the job submission pathway as a side-effect (because in sim mode the two might as well be identical). However, the pathway is not currently spawning children of started.

Example

Run this

# flow.cylc
[scheduling]
    [[graph]]
        R1 = a? & a:started => b

[runtime]
  [[root]]
    run mode  = skip

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@wxtim wxtim self-assigned this Sep 3, 2024
@wxtim wxtim force-pushed the fix.simulation_mode_started_bug branch from 9453fb9 to 28f4475 Compare September 3, 2024 09:35
@wxtim wxtim changed the base branch from master to 8.3.x September 3, 2024 09:36
@wxtim wxtim requested a review from MetRonnie September 3, 2024 09:36
@wxtim wxtim mentioned this pull request Sep 3, 2024
8 tasks
Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

This is a bug (note just a shortcoming in simulation mode) because (as Tim noted) the DB lists the "started" output, but the downstream task is never triggered which is inconsistent.

I did wonder whether this should be fixed by getting the submit code to send the started output like so:

diff --git a/cylc/flow/task_job_mgr.py b/cylc/flow/task_job_mgr.py
index 185966ff1..7a6451605 100644
--- a/cylc/flow/task_job_mgr.py
+++ b/cylc/flow/task_job_mgr.py
@@ -1028,6 +1028,9 @@ class TaskJobManager:
             self.task_events_mgr.process_message(
                 itask, INFO, TASK_OUTPUT_SUBMITTED,
             )
+            self.task_events_mgr.process_message(
+                itask, INFO, TASK_OUTPUT_STARTED,
+            )
             self.workflow_db_mgr.put_insert_task_jobs(
                 itask, {
                     'time_submit': now_str,

But quickly realised why we don't do that:

ERROR - 'execution polling intervals'
    Traceback (most recent call last):
    ...
    KeyError: 'execution polling intervals'

This makes Cylc think that there is actually a job running, it tries to setup health checks which fail for obvious reasons.

tests/integration/run_modes/test_simulation.py Outdated Show resolved Hide resolved
Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Not merging in case @oliver-sanders comment requires a small change?

@wxtim wxtim force-pushed the fix.simulation_mode_started_bug branch from a8c153e to 650f987 Compare September 4, 2024 08:33
@wxtim wxtim removed the request for review from MetRonnie September 4, 2024 08:37
@wxtim
Copy link
Member Author

wxtim commented Sep 4, 2024

@oliver-sanders - final check that you are happy with my response to your comment?

@oliver-sanders
Copy link
Member

Not sure which one, but yes I'm happy!

@oliver-sanders oliver-sanders added this to the 8.3.4 milestone Sep 4, 2024
@oliver-sanders oliver-sanders added the bug Something is wrong :( label Sep 4, 2024
@oliver-sanders oliver-sanders merged commit f30f0c2 into cylc:8.3.x Sep 4, 2024
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :(
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants