Simplify and fix runahead computation. #5893
6164367 to 5e7f4d4
LGTM. I think this might actually close #5825. I've transferred the test from that branch here: hjoliver#36
if compat_mode == 'compat-mode':
    # Cylc 7 does not count failed tasks in runahead computation.
    assert int(str(schd.pool.runahead_limit_point)) == 5
Different from your original version of the test @oliver-sanders. Cylc 7 ignores failed tasks when computing the limit, that should include submit-failed.
that should include submit-failed
It would appear that Cylc 7 didn't include submit-failed.
E.G. In this workflow 2/foo will submit-fail:
[scheduling]
    max active cycle points = 3
    cycling mode = integer
    initial cycle point = 1
    [[dependencies]]
        [[[P1]]]
            graph = foo
        [[[R1/2]]]
            graph = foo[-P1] => foo
[runtime]
    [[foo]]
        script = """
            if [[ $CYLC_TASK_CYCLE_POINT -eq 1 ]]; then
                cylc broadcast "${CYLC_SUITE_NAME}" -n foo -p 2 -s '[environment]foo=$(if'
                sleep 5
            fi
        """
In Cylc 7 the workflow stalls because the submit dependence prevents 3/foo from being spawned. If you insert 3/foo, the workflow will stall on the runahead limit:
Whereas on this branch the workflow will run on (so no stall):
that should include submit-failed
It would appear that Cylc 7 didn't include submit-failed.
Notice that I said it should include submit-failed - I didn't actually check that it did. So thanks for checking!
OK I'll revert that to reproduce the (incorrect) Cylc 7 behavior.
assert int(str(schd.pool.runahead_limit_point)) == 4  # no change

# mark cycle 1 as complete
# (via task message so the task gets removed before runahead compute)
The new extra-simplified method assumes completed tasks have been removed already, which doesn't happen if you artificially call state_reset in a test.
Review so far...
- Wrote simple examples to check that both bugs were fixed.
- Read the code.
- Checked that the tests covered the two bug cases.
},
}
}
point = lambda point: IntegerPoint(str(int(point)))
Flake8 E731 is not happy with this. I'm not sure how important it is in this context, but consider replacing
point = lambda point: IntegerPoint(str(int(point)))
with
def point(point): return IntegerPoint(str(int(point)))
Flake8 won't be happy with that inline function either.
IMO it's ok, especially for a test.
I'll leave it as-is then.
Flake8 won't be happy with that inline function either.
You sure? PEP8 clearly shows an example:
# Correct:
def f(x): return 2*x
and Flake8 seems OK with it:
> echo 'def point(point): return 42'> foo.py
> flake8 foo.py
...
Don't care, tis a test.
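For reference, a minimal sketch of what E731 is about. It is not the test code itself: `str` stands in for `IntegerPoint` so the snippet runs without Cylc installed, and the names `point_lambda` and `point_def` are made up for illustration. Both spellings behave the same when called; the difference is that a `def` gives the function a real name, which shows up in tracebacks.

```python
# Stand-in helpers: str replaces IntegerPoint (an assumption, for illustration only).
point_lambda = lambda value: str(int(value))  # noqa: E731  (the form Flake8 flags)


def point_def(value):
    """Same behaviour, written as a named function."""
    return str(int(value))


# Both forms normalise a point-like value in the same way.
assert point_lambda("03") == point_def("03") == "3"

# Only the def carries a useful name for tracebacks and debugging.
lambda_name = point_lambda.__name__  # '<lambda>'
def_name = point_def.__name__        # 'point_def'
print(lambda_name, def_name)
```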
Co-authored-by: Oliver Sanders <[email protected]>
In back-compat mode the cycle point time zone is assumed to be local, whereas in normal mode it is assumed to be UTC. There was contamination of the point parse caching, where the time zone would carry over between tests of back-compat and normal mode.
* We were using the pytest-env plugin to run the tests in a non-UTC time zone.
* The pytest-env plugin doesn't work with pytest-xdist so this was being ignored.
* Also due to the way TZ support works in Python, changing the env var whilst Python is running may or may not result in changes.
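The last point can be illustrated with a small sketch. This is not the fix applied in this PR, and `set_process_timezone` is a hypothetical helper name. On Unix, assigning to `os.environ["TZ"]` alone is not guaranteed to affect a running interpreter; calling `time.tzset()` afterwards makes the C library re-read the variable. The zone strings below are plain POSIX TZ strings, chosen so that no time zone database is needed.

```python
import os
import time


def set_process_timezone(tz: str) -> None:
    """Switch the running process to a new time zone (hypothetical helper)."""
    os.environ["TZ"] = tz
    # Without tzset() the C library may keep using the zone it read earlier.
    if hasattr(time, "tzset"):  # tzset() is only available on Unix
        time.tzset()


# POSIX TZ string: a zone named XXX that is 5h30m east of UTC.
set_process_timezone("XXX-05:30")
offset_east = -time.timezone  # time.timezone is in seconds WEST of UTC

# Back to UTC so that later code sees a predictable zone.
set_process_timezone("UTC0")
offset_utc = -time.timezone
```

A test suite that needs a non-UTC zone could apply something like this in a fixture rather than relying on an environment plugin, though whether that suits this repository's test setup is an assumption.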
Co-authored-by: Ronnie Dutta <[email protected]>
Add compat mode and not compat mode versions of the future triggers bug test.
changing task statuses in compat mode.
238a790 to d9c3ba3
This reverts commit 520ee29.
d9c3ba3 to 9941fad
No, that's a mistake. I think I've been stung by the behaviour of WIP is investigatory work on another ticket and you definitely don't want it.
Fix bug revealed on the forum: the runahead limit point erroneously advances when the limit is specified as a time interval and future triggers are present.
Taking a closer look at the code, I realized it still contained nasty vestiges of the old "max active cycle points" computation, when we were (or at least thought we were) counting active points, rather than all possible points, beyond the base point. Probably my bad 😬
It's considerably simpler and more efficient on this branch.
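As a rough illustration of the idea (not the code on this branch), here is a sketch of a runahead limit that counts all possible points beyond the base point. The function name, the `(point, status)` pool layout and the `count` parameter are all assumptions made for the example; the compat-mode branch mirrors the review discussion, where Cylc 7 ignores failed (but not submit-failed) tasks.

```python
def runahead_limit(pool, sequence_points, count, compat_mode=False):
    """Return the furthest point allowed to run (illustrative sketch only).

    pool: iterable of (cycle_point, status) pairs still in the task pool;
          completed tasks are assumed to have been removed already.
    sequence_points: every cycle point the workflow's sequences can produce.
    count: how many points beyond the base point may be active.
    """
    # Assumption from the review thread: Cylc 7 ignores failed tasks only.
    ignored = {"failed"} if compat_mode else set()
    active = [point for point, status in pool if status not in ignored]
    if not active:
        return None
    base = min(active)
    # Count all possible points after the base, not just those with tasks.
    ahead = [base] + sorted(p for p in sequence_points if p > base)
    return ahead[min(count, len(ahead) - 1)]


pool = [(1, "failed"), (2, "waiting"), (3, "waiting")]
points = range(1, 11)
normal_limit = runahead_limit(pool, points, 3)
compat_limit = runahead_limit(pool, points, 3, compat_mode=True)
print(normal_limit, compat_limit)
```

With these made-up inputs the failed task at point 1 holds the limit at 4 in normal mode, while compat mode skips it and the limit moves to 5, which matches the values asserted in the test under review.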
[UPDATE] close #5825
Check List
- […] CONTRIBUTING.md and added my name as a Code Contributor.
- […] setup.cfg (and conda-environment.yml if present).
- CHANGES.md entry included if this is a change that can affect users
- […] ?.?.x branch.