Support for a new job runner (handler), the green scheduler CATS #6519

Open
sadielbartholomew opened this issue Dec 6, 2024 · 0 comments

Comments

@sadielbartholomew
Collaborator

Hi Cylc team, hope you are well. I come to you with a feature request: adding support for another job scheduling tool, namely CATS (https://github.com/GreenScheduler/cats), a 'green' scheduler. On our end we are looking to promote it and to improve its integration with scheduling and workflow tools.

As well as my specific requests detailed below, I welcome any thoughts on further ways we might provide Cylc support for CATS and vice versa, where we, the CATS team, will take on the work (PRs etc.). Thanks.

Problem

Ultimately, we'd love for there to be support between Cylc and CATS in both directions. The CATS scheduler is a new-ish tool that a small team and I developed to time-shift (delay) a job's run time in order to minimise the carbon intensity of the overall job. If a summary of CATS would be useful, as well as the docs (esp. the introduction section) you could consult the slides here from a recent presentation. So, the vision is for Cylc to allow job scheduling from CATS calculations, and for CATS to allow whole Cylc workflows to be time-shifted, ultimately with the goal of helping to reduce carbon emissions.

Based on my own ideas as to the above, specifically I was wondering if you'd consider adding built-in support for another job runner handler, i.e. adding to the options listed in the 'Supported Job Submission Methods' table.

Strictly, CATS does not do the actual job running but calculates the optimal time to run a job. So overall we are looking for further ways to integrate CATS with or into workflow tools beyond the primitive 'at' UNIX scheduler, which is all we support at this early stage (v1.0 was released this summer), e.g. through the command argument --scheduler <scheduling/workflow tool e.g. at>. We will soon integrate with SLURM via a plugin, so will shortly support that too.

With respect to Cylc, I guess we could take this on at a single-job or full-workflow level, i.e. time-shift the start time of (1) a given job, or (2) the full workflow run. This issue concerns (1), because for (2), the full-workflow level, I believe we could just wrap the cylc play <all opts/args> command to support a time-shifted workflow start for a --scheduler cylc option, though we welcome your thoughts about that. See my suggestion below.
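To make option (2) concrete, here is a rough sketch of how a wrapper might hand a cylc play command to at for a delayed start. The function name build_delayed_play and the workflow ID my-workflow are made up for illustration, and the start time is assumed to have already been obtained from a CATS calculation as a local-time datetime. This is not existing CATS or Cylc code.

```python
import shlex
from datetime import datetime


def build_delayed_play(workflow_id, start, play_args=()):
    """Return (argv, stdin_text) for submitting `cylc play` to `at`.

    Hypothetical helper. `at -t` takes a POSIX touch-style timestamp
    ([[CC]YY]MMDDhhmm) in local time and reads the command to run from
    standard input. `start` is assumed to be a local-time datetime.
    """
    at_argv = ["at", "-t", start.strftime("%Y%m%d%H%M")]
    # Quote each argument so the command survives the shell that `at` uses.
    play_cmd = shlex.join(["cylc", "play", *play_args, workflow_id])
    return at_argv, play_cmd + "\n"


# Example with a made-up workflow ID and start time.
argv, stdin_text = build_delayed_play("my-workflow", datetime(2024, 12, 6, 23, 30))
# To actually submit:
# subprocess.run(argv, input=stdin_text, text=True, check=True)
```

Returning the command and its standard input separately, rather than running them, keeps the sketch free of side effects and leaves the submission step to the caller.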

As for use cases, though we appreciate that NWP and time-critical operational applications probably wouldn't have a use for this, there could be research and test workflows where this would be a useful (environmentally beneficial) alternative for jobs otherwise assigned to the more primitive job runners, e.g. at and background.

Proposed Solution

For (1), the job case, ideally you could support us through a new job runner handler, as detailed below. I see we can write our own custom job runner handler, but I doubt that would be widely used even if we tried to advertise the job handler code we wrote, because of the extra setup needed to include it in a workflow. If it were built in, it would be simple for Cylc users to try out, which would hopefully encourage some usage.

My suggestion is a new built-in job runner handler cylc.flow.job_runner_handlers.cats which users could specify. It would use the at scheduler under-the-hood but at a delayed time calculated by CATS to minimise the overall carbon intensity.

Whereas at would run the job as soon as Cylc deems the task ready to run from the task conditions graph and cycle requirements, with cats specified it would effectively (or actually, depending on the implementation you advise) immediately run the CATS calculation cats --duration <expected duration> --location <postcode> --format json to return the optimal run time in machine-parsable format (i.e. JSON). It would then set the task to a waiting status until the wall clock reaches that time, and finally run it.
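As a rough illustration of that flow, the sketch below builds the cats command line quoted above, reads a start time out of its JSON output, and converts it to the timestamp form that at -t accepts. The JSON layout is an assumption: the key name "start" and the sample payload are hypothetical stand-ins, not the documented CATS output schema, and the helper names are made up for this example.

```python
import json
from datetime import datetime


def build_cats_command(duration, postcode):
    """Build the CATS invocation, using only the flags quoted in this issue."""
    return [
        "cats",
        "--duration", str(duration),
        "--location", postcode,
        "--format", "json",
    ]


def optimal_start_from_cats_json(cats_stdout, key="start"):
    """Extract the optimal start time from CATS JSON output.

    ASSUMPTION: the output is a JSON object holding an ISO 8601 timestamp
    under `key`. The real CATS schema may differ.
    """
    payload = json.loads(cats_stdout)
    return datetime.fromisoformat(payload[key])


def at_time_arg(start):
    """Format a datetime for `at -t` ([[CC]YY]MMDDhhmm, local time).

    Timezone-aware values are converted to local time first; naive values
    are taken to be local time already.
    """
    return start.astimezone().strftime("%Y%m%d%H%M")


# Hypothetical CATS output, for illustration only.
sample_stdout = '{"start": "2024-12-06T23:30:00"}'
start = optimal_start_from_cats_json(sample_stdout)
at_argv = ["at", "-t", at_time_arg(start)]
```

In a real handler the sample string would be replaced by the captured standard output of the command from build_cats_command, and the job script would be passed to at on standard input.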

The minimum configuration required in the job runner sub-section of the relevant part of the Cylc config file would be the expected job duration (suggested key name: duration) and the postcode as a proxy for the location (suggested key name: postcode).
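For illustration, a handler might check those two settings along the following lines before calling CATS. This is a sketch only: the CatsSettings class and its from_mapping method are hypothetical, the settings are assumed to arrive as a plain mapping of strings, and the duration is assumed to be a whole number in whatever unit cats --duration expects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatsSettings:
    """Hypothetical container for the two proposed settings."""

    duration: int
    postcode: str

    @classmethod
    def from_mapping(cls, settings):
        """Validate a mapping holding the 'duration' and 'postcode' keys."""
        missing = [key for key in ("duration", "postcode") if key not in settings]
        if missing:
            raise ValueError(f"missing CATS settings: {', '.join(missing)}")
        duration = int(settings["duration"])  # raises ValueError if not numeric
        if duration <= 0:
            raise ValueError("duration must be a positive number")
        postcode = str(settings["postcode"]).strip()
        if not postcode:
            raise ValueError("postcode must not be empty")
        return cls(duration=duration, postcode=postcode)


# Example with made-up values, as they might be read from a config file.
settings = CatsSettings.from_mapping({"duration": "90", "postcode": " OX1 "})
```

Failing early on a missing or malformed setting would let the task report a submission error straight away instead of after a CATS call has been attempted.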

I am happy to write the code to implement this. Please let us know your thoughts.
