reload: reloading shakes out runahead tasks #5825

Closed
oliver-sanders opened this issue Nov 17, 2023 · 2 comments
oliver-sanders commented Nov 17, 2023

There's a nasty interaction between reload and runahead.

[scheduler]                    
    allow implicit tasks = True    
                               
[scheduling]                   
    initial cycle point = 2000
    runahead limit = P5D       
    [[graph]]                  
        P1D = """              
            a => b             
        """                    
                               
[runtime]                      
    [[root]]                   
        script = sleep 500 

When this workflow starts, tasks 2000/a through 2005/a are submitted:

2023-11-17T10:05:39Z INFO - [2000/a submitted job:01 flows:1] => running
2023-11-17T10:05:39Z INFO - [2000/a running job:01 flows:1] health: execution timeout=None, polling intervals=PT1H,...
2023-11-17T10:05:39Z INFO - [2002/a submitted job:01 flows:1] => running
2023-11-17T10:05:39Z INFO - [2002/a running job:01 flows:1] health: execution timeout=None, polling intervals=PT1H,...
2023-11-17T10:05:39Z INFO - [2003/a submitted job:01 flows:1] => running
2023-11-17T10:05:39Z INFO - [2003/a running job:01 flows:1] health: execution timeout=None, polling intervals=PT1H,...
2023-11-17T10:05:39Z INFO - [2004/a submitted job:01 flows:1] => running
2023-11-17T10:05:39Z INFO - [2004/a running job:01 flows:1] health: execution timeout=None, polling intervals=PT1H,...
2023-11-17T10:05:39Z INFO - [2005/a submitted job:01 flows:1] => running
2023-11-17T10:05:39Z INFO - [2005/a running job:01 flows:1] health: execution timeout=None, polling intervals=PT1H,...

But if I reload the workflow, 2006/a is released from the runahead limit and queued for submission:

2023-11-17T10:05:40Z INFO - Reloading task definitions.
2023-11-17T10:05:40Z INFO - [2000/a running job:01 flows:1] reloaded task definition
2023-11-17T10:05:40Z WARNING - [2000/a running job:01 flows:1] active with pre-reload settings
2023-11-17T10:05:40Z INFO - [2001/a running job:01 flows:1] reloaded task definition
2023-11-17T10:05:40Z WARNING - [2001/a running job:01 flows:1] active with pre-reload settings
2023-11-17T10:05:40Z INFO - [2002/a running job:01 flows:1] reloaded task definition
2023-11-17T10:05:40Z WARNING - [2002/a running job:01 flows:1] active with pre-reload settings
2023-11-17T10:05:40Z INFO - [2003/a running job:01 flows:1] reloaded task definition
2023-11-17T10:05:40Z WARNING - [2003/a running job:01 flows:1] active with pre-reload settings
2023-11-17T10:05:40Z INFO - [2004/a running job:01 flows:1] reloaded task definition
2023-11-17T10:05:40Z WARNING - [2004/a running job:01 flows:1] active with pre-reload settings
2023-11-17T10:05:40Z INFO - [2005/a running job:01 flows:1] reloaded task definition
2023-11-17T10:05:40Z WARNING - [2005/a running job:01 flows:1] active with pre-reload settings
2023-11-17T10:05:40Z INFO - [2006/a waiting(runahead) job:00 flows:1] reloaded task definition
2023-11-17T10:05:40Z INFO - LOADING job data
2023-11-17T10:05:40Z INFO - [2006/a waiting(runahead) job:00 flows:1] => waiting
2023-11-17T10:05:40Z INFO - [2006/a waiting job:00 flows:1] => waiting(queued)

Reload again and 2007/a is released:

2023-11-17T10:05:45Z INFO - Reloading task definitions.
...
2023-11-17T10:05:45Z INFO - [2007/a waiting(runahead) job:00 flows:1] => waiting
2023-11-17T10:05:45Z INFO - [2007/a waiting job:00 flows:1] => waiting(queued)
2023-11-17T10:05:45Z INFO - Reload completed.

Reload again and 2008/a is released:

2023-11-17T10:05:48Z INFO - Reloading task definitions.
...
2023-11-17T10:05:49Z INFO - [2008/a waiting(runahead) job:00 flows:1] => waiting
2023-11-17T10:05:49Z INFO - [2008/a waiting job:00 flows:1] => waiting(queued)

And so on.

oliver-sanders commented

The issue seems to be caused by compute_runahead counting cycles with runahead tasks as active.

oliver-sanders added a commit to oliver-sanders/cylc-flow that referenced this issue Nov 21, 2023
* Closes cylc#5825
* Cycles were considered active if they contained runahead limited
  tasks.
* This could cause the runahead limit to be bumped forwards whenever the
  limit calculation was forced to update, e.g. on reload.
* This filters out tasks at or beyond the runahead limit and straightens
  out the task status checks to match Cylc 7 behaviour in compat mode.
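
To make the suspected cause concrete, here is a toy model of the behaviour described in this issue. It is not Cylc's compute_runahead: the function names, the pool representation, and the rule that the limit never drops below an already-active cycle are all assumptions chosen to reproduce the one-cycle-per-reload symptom. Cycles are modelled as plain integers, as they appear in the log above.

```python
RUNAHEAD = 5  # cycles beyond the base point; stands in for the configured limit


def compute_limit(pool, count_runahead_as_active):
    """Return the runahead limit for a pool mapping cycle -> task state.

    Hypothetical stand-in for the real calculation, for illustration only.
    """
    active = [
        cycle
        for cycle, state in pool.items()
        if count_runahead_as_active or state != "waiting(runahead)"
    ]
    base = min(active)
    # Assumption: the limit never drops below a cycle that is already active.
    return max(base + RUNAHEAD, max(active))


def reload(pool, count_runahead_as_active):
    """Force a limit recalculation, then release any task now within it."""
    limit = compute_limit(pool, count_runahead_as_active)
    for cycle, state in list(pool.items()):
        if state == "waiting(runahead)" and cycle <= limit:
            pool[cycle] = "waiting(queued)"
            # Releasing a task spawns its successor, held back by the limit.
            pool[cycle + 1] = "waiting(runahead)"
    return limit


def simulate(reloads, count_runahead_as_active):
    """Start with 2000-2005 running and 2006 runahead-limited, then reload."""
    pool = {cycle: "running" for cycle in range(2000, 2006)}
    pool[2006] = "waiting(runahead)"
    return [reload(pool, count_runahead_as_active) for _ in range(reloads)]


print(simulate(3, count_runahead_as_active=True))   # limit after each reload
print(simulate(3, count_runahead_as_active=False))  # runahead tasks filtered out
```

In this model, counting the runahead-limited cycle as active moves the limit forward by one cycle on every forced recalculation, while filtering those tasks out keeps the limit where it was.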
oliver-sanders added a commit to oliver-sanders/cylc-flow that referenced this issue Nov 22, 2023
oliver-sanders added a commit to oliver-sanders/cylc-flow that referenced this issue Nov 28, 2023
oliver-sanders added a commit to oliver-sanders/cylc-flow that referenced this issue Dec 20, 2023
hjoliver pushed a commit to hjoliver/cylc-flow that referenced this issue Jan 8, 2024
hjoliver commented Jan 9, 2024

Closed by #5893

hjoliver closed this as completed Jan 9, 2024