xtrigger efficiency fix. #5908

Merged
merged 7 commits into from
Jan 11, 2024

hjoliver
Member

@hjoliver hjoliver commented Jan 5, 2024

Update the DB insert-maps for xtriggers when the xtriggers get satisfied, not when the tasks that depend on them get satisfied.

This bug (which causes duplicate entries in lists of DB updates) can cause long delays (minutes) just after initially populating the task pool, if hundreds of tasks depend on the same xtrigger (typically, a clock trigger).

The workflow that revealed this is large, with several hundred cycles of runahead, and it generated 1.8 million update entries instead of ~800 🤯
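As a rough illustration of the difference described above, here is a minimal sketch, not the actual Cylc code. The function names, the task names, the row layout, and the clock-trigger signature string are all hypothetical, chosen only to show how queuing a row per dependent task multiplies entries while queuing a row per satisfied xtrigger does not.

```python
def queue_per_task(task_xtriggers, satisfied):
    """Pattern described as the bug: queue one row per (task, xtrigger) pair."""
    queued = []
    for _task, signatures in task_xtriggers.items():
        for sig in signatures:
            if sig in satisfied:
                # Every dependent task re-queues the same xtrigger result.
                queued.append({"signature": sig, "results": satisfied[sig]})
    return queued


def queue_per_xtrigger(satisfied):
    """Pattern described as the fix: queue one row per satisfied xtrigger."""
    return [
        {"signature": sig, "results": results}
        for sig, results in satisfied.items()
    ]


# Hypothetical setup: 300 tasks all waiting on the same clock trigger.
SIG = "wall_clock(trigger_time=1704067200)"  # illustrative signature only
task_xtriggers = {f"task_{i}": [SIG] for i in range(300)}
satisfied = {SIG: "{}"}

per_task = queue_per_task(task_xtriggers, satisfied)
per_xtrigger = queue_per_xtrigger(satisfied)
print(len(per_task), len(per_xtrigger))
```

In this toy setup the per-task list grows with the number of dependent tasks, while the per-xtrigger list has one entry per distinct xtrigger regardless of how many tasks wait on it.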

We should get this into 8.2.4 given that it has caused problems out in the wild already.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

Update the DB xtriggers table when the xtriggers get satisfied, not
when the tasks that depend on them get satisfied.
@hjoliver hjoliver added the bug label Jan 5, 2024
@hjoliver hjoliver added this to the cylc-8.2.4 milestone Jan 5, 2024
@hjoliver hjoliver self-assigned this Jan 5, 2024
@hjoliver hjoliver added the efficiency For notable efficiency improvements label Jan 8, 2024
@hjoliver
Member Author

hjoliver commented Jan 8, 2024

Ready to go. Any reviewers available at your end @oliver-sanders ? (DS is still on leave here).

cylc/flow/workflow_db_mgr.py (review thread: outdated, resolved)
Member

@wxtim wxtim left a comment

  • Read code. Made a few comments, but none are critical to this PR.
  • Manually tested bug and fix.

😄

Member

@oliver-sanders oliver-sanders left a comment

LGTM, though I'm not sure the DB table is doing what it's supposed to (before or after).

[unrelated] I had a go at running a workflow that calls an xtrigger which returns data ({'abc': 'def'}). Taking a look at the result of running the workflow with --main-loop='log db', it looks like there are duplicate SQL calls updating the broadcast tables (before and after). Guessing this ain't meant to happen?

INSERT OR REPLACE INTO broadcast_events
VALUES('2024-01-08T10:41:49.690744Z', '+', '20240108T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_events
VALUES('2024-01-08T10:41:49.692659Z', '+', '20240109T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_events
VALUES('2024-01-08T10:41:49.694808Z', '+', '20240110T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_events
VALUES('2024-01-08T10:41:49.697189Z', '+', '20240111T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_events
VALUES('2024-01-08T10:41:49.699815Z', '+', '20240112T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_states
VALUES('20240108T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_states
VALUES('20240109T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_states
VALUES('20240110T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_states
VALUES('20240111T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')

INSERT OR REPLACE INTO broadcast_states
VALUES('20240112T0000Z', 'bar', '[environment]myxtrigger_abc', 'def')
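For context on how repeated `INSERT OR REPLACE` statements behave in SQLite, here is a small sketch using Python's standard `sqlite3` module. The table schema below (in particular the primary key on point, namespace and key) is an assumption made for illustration and is not taken from the Cylc source. It shows that re-running the same statement costs a write each time but leaves only one stored row.

```python
import sqlite3


def count_after_repeated_inserts(n_repeats):
    """Run the same INSERT OR REPLACE n_repeats times; return the stored row count."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema for illustration only; the real table may differ.
    conn.execute(
        "CREATE TABLE broadcast_states ("
        " point TEXT, namespace TEXT, key TEXT, value TEXT,"
        " PRIMARY KEY (point, namespace, key))"
    )
    row = ("20240108T0000Z", "bar", "[environment]myxtrigger_abc", "def")
    # Each repeat replaces the existing row that has the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO broadcast_states VALUES (?, ?, ?, ?)",
        [row] * n_repeats,
    )
    conn.commit()
    stored = conn.execute("SELECT COUNT(*) FROM broadcast_states").fetchone()[0]
    conn.close()
    return stored


print(count_after_repeated_inserts(1000))
```

Under that assumed schema, repeated statements do not corrupt the table contents; the cost is the redundant work of executing them.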

cylc/flow/xtrigger_mgr.py (review thread: resolved)
Co-authored-by: Oliver Sanders <[email protected]>
Co-authored-by: Tim Pillinger <[email protected]>
@oliver-sanders oliver-sanders merged commit 80f06f2 into cylc:8.2.x Jan 11, 2024
32 of 36 checks passed
Labels
bug efficiency For notable efficiency improvements
3 participants