Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

task_pool: fix compute_runahead calculation #5833

Closed
wants to merge 3 commits into from

Conversation

oliver-sanders
Copy link
Member

@oliver-sanders oliver-sanders commented Nov 21, 2023

  • Closes reload: reloading shakes out runahead tasks #5825
  • Cycles were considered active if they contained runahead limited tasks.
  • This could cause the runahead limit to be bumped forwards whenever the limit calculation was forced to update, e.g. on reload.
  • This filters out tasks at or beyond the runahead limit and straigntens out the task status checks to match Cylc 7 behaviour in compat mode.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@oliver-sanders oliver-sanders added the bug Something is wrong :( label Nov 21, 2023
@oliver-sanders oliver-sanders added this to the cylc-8.2.4 milestone Nov 21, 2023
@oliver-sanders oliver-sanders self-assigned this Nov 21, 2023
@@ -362,20 +362,35 @@ def compute_runahead(self, force=False) -> bool:
# Find the earliest point with unfinished tasks.
for point, itasks in sorted(self.get_tasks_by_point().items()):
if (
points # got the limit already so this point too
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could never be true as points is populated on a different branch.

Copy link
Member

@hjoliver hjoliver Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's populated on this branch too, within the same loop. So the logic says to automatically append the next points once we have added the first one based on the state checks. (However, I'm not sure why it wasn't just stopping at the first one, since then we just take the min anyway).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're iterating in sorted order the earlier cycles should come first, so a break should make sense?

Copy link
Member Author

@oliver-sanders oliver-sanders Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's using points later in ways which don't cause test failures, so I'll restore the original to avoid breakages.

Actually, no, adding this line back breaks the new tests. I'm not sure why we would add cycle points with no active tasks to the points list?

Comment on lines -367 to -371
not itask.state(
TASK_STATUS_FAILED,
TASK_STATUS_SUCCEEDED,
TASK_STATUS_EXPIRED
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This counted waiting runahead tasks.

It should be superfluous if not cylc.flow.flags.cylc7_back_compat because is_incomplete should cover it anyway.

@oliver-sanders oliver-sanders changed the base branch from master to 8.2.x November 21, 2023 14:50
@oliver-sanders oliver-sanders marked this pull request as draft November 21, 2023 15:48
@oliver-sanders oliver-sanders marked this pull request as ready for review November 22, 2023 15:37
Comment on lines +83 to +84
series using the same pytest runner. This incentivises breaking up larger test
modules.
Copy link
Member

@wxtim wxtim Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible alternative

Suggested change
series using the same pytest runner. This incentivises breaking up larger test
modules.
series using the same pytest runner. Consider breaking up
large test modules to allow tests to be run in parallel.

@MetRonnie
Copy link
Member

Got integration test failures

@oliver-sanders
Copy link
Member Author

Got integration test failures

See the OP, caused by the same issue fixed in #5826

@MetRonnie
Copy link
Member

Is that the correct PR? It doesn't mention closing an issue, and also is against master rather than 8.2.x

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new logic is correct, but can we just bail out of the loop once the first point is found? (given the loop is over sorted points, and we only want the min point in the end).

tests/integration/test_task_pool.py Show resolved Hide resolved
@@ -362,20 +362,35 @@ def compute_runahead(self, force=False) -> bool:
# Find the earliest point with unfinished tasks.
for point, itasks in sorted(self.get_tasks_by_point().items()):
if (
points # got the limit already so this point too
Copy link
Member

@hjoliver hjoliver Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's populated on this branch too, within the same loop. So the logic says to automatically append the next points once we have added the first one based on the state checks. (However, I'm not sure why it wasn't just stopping at the first one, since then we just take the min anyway).

cylc/flow/task_pool.py Show resolved Hide resolved
@oliver-sanders
Copy link
Member Author

oliver-sanders commented Nov 28, 2023

Is that the correct PR? It doesn't mention closing an issue, and also is against master rather than 8.2.x

Yes, it's the right PR, the issue bring fixed was the way the time zone variable was set in the tests. Will have to pull the fix into 8.2.x.

* Closes cylc#5825
* Cycles were considered active if they contained runahead limited
  tasks.
* This could cause the runahead limit to be bumped forwards whenever the
  limit calculation was forced to update, e.g. on reload.
* This filters out tasks at or beyond the runahead limit and straigntens
  out the task status checks to match Cylc 7 behaviour in compat mode.
* Setting the TZ env var within a Python session doesn't work as
  intended.
* We'll add a TZ offset in CI in the near future.
Comment on lines 36 to +38
# a weird timezone to check that tests aren't assuming the local timezone
TZ=XXX-09:35
# Note: this doesn't work properly with pytest-xdist
# TZ=XXX-09:35
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prevents the new test from failing.

Setting the TZ variable in this way doesn't work as intended.

A solution will arrive in #5826

@oliver-sanders
Copy link
Member Author

tests/f/runahead all pass, but a bunch of others fail, back to draft.

@oliver-sanders oliver-sanders marked this pull request as draft November 28, 2023 11:21
@oliver-sanders
Copy link
Member Author

Hopefully superseded by #5893

@oliver-sanders oliver-sanders removed this from the cylc-8.2.4 milestone Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :( superseded
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reload: reloading shakes out runahead tasks
4 participants