Hi Cylc team, hope you are well. I have a feature request to add support for another job scheduling tool, namely CATS (https://github.com/GreenScheduler/cats), a 'green' scheduler. On our end we are looking to promote it and to improve its integration with scheduling and workflow tools.
As well as the specific request detailed below, I welcome any thoughts on further ways we might provide Cylc support for CATS and vice versa, where we, the CATS team, would take on the work (PR etc.). Thanks.
Problem
Ultimately, we'd love there to be support in both directions between Cylc and CATS. CATS is a new-ish tool that a small team and I developed to time-shift (delay) a job's run time in order to minimise the carbon intensity of the overall job. If a summary of CATS would be useful, as well as the docs (especially the introduction section) you could consult the slides here from a recent presentation. The vision is for Cylc to allow job scheduling based on CATS calculations, and for CATS to allow whole Cylc workflows to be time-shifted, ultimately with the goal of helping to reduce carbon emissions.
Specifically, I was wondering if you'd consider adding built-in support for another job runner handler, i.e. in addition to the options listed in the 'Supported Job Submission Methods' table.
Strictly, CATS does not do the actual job running; it calculates the optimal time to run a job. So overall we are looking for further ways to integrate CATS with or into workflow tools beyond the primitive `at` UNIX scheduler, which is all we support at this early stage (v1.0 was released this summer), e.g. through the command argument `--scheduler <scheduling/workflow tool e.g. at>`. We will soon integrate with SLURM via a plugin, so will shortly support that too.
With respect to Cylc, I guess we could take this on at a single-job or full-workflow level, i.e. time-shift (1) the start time of a given job, or (2) the start of the full workflow run. This issue concerns (1). For (2), the full-workflow level, I believe we could just wrap the `cylc play <all opts/args>` command to support a time-shifted workflow start for a `--scheduler cylc` option, though we welcome your thoughts about that. See my suggestion below.
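For (2), a minimal sketch of what such a wrapper might look like, assuming CATS has already produced a start time. The function name `play_at` and its injectable `now`, `sleep` and `run` parameters are hypothetical, there only so the logic can be exercised without actually waiting or launching Cylc; the only real command involved is `cylc play`.

```python
import subprocess
import time
from datetime import datetime, timezone


def play_at(start, play_args, now=None, sleep=time.sleep, run=subprocess.run):
    """Wait until `start` (a timezone-aware datetime), then run `cylc play`.

    Hypothetical helper: `now`, `sleep` and `run` are injectable for testing.
    """
    now = now or datetime.now(timezone.utc)
    # Never wait a negative amount if the start time has already passed.
    delay = max(0.0, (start - now).total_seconds())
    if delay:
        sleep(delay)
    command = ["cylc", "play", *play_args]
    run(command, check=True)
    return command
```

A real implementation would more likely hand the delayed start to `at` or a similar tool than block in a sleeping process, but the control flow would be the same.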
As for use cases, we appreciate that NWP and time-critical operational applications probably wouldn't have a use for this, but there could be research and test workflows where it would be a useful (environmentally beneficial) alternative for jobs otherwise assigned to the more primitive job runners, e.g. `at` and `background`.
Proposed Solution
For (1), the job case, ideally you could support us through a new job runner handler, as detailed below. I see that we can write our own custom job runner handler, but I doubt it would be widely used, even if we advertised the handler code, because of the extra setup needed to include it in a workflow. If it were built in, it would be simple for Cylc users to try out, which would hopefully encourage some usage.
My suggestion is a new built-in job runner handler, `cylc.flow.job_runner_handlers.cats`, which users could specify. It would use the `at` scheduler under the hood, but at a delayed time calculated by CATS to minimise the overall carbon intensity.
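To illustrate the "delayed `at`" idea, here is a rough sketch of building an `at` command line for a given start time. It relies on the POSIX `at` options `-f file` and `-t [[CC]YY]MMDDhhmm[.SS]`; the function name `delayed_at_command` is hypothetical, and how this would slot into Cylc's handler interface is deliberately left open.

```python
from datetime import datetime


def delayed_at_command(job_file_path, start):
    """Build an `at` command that runs `job_file_path` at `start`.

    `start` is taken to be a naive datetime in the local time zone of the
    job host, formatted as CCYYMMDDhhmm (minute resolution) for `at -t`.
    """
    timestamp = start.strftime("%Y%m%d%H%M")
    return ["at", "-f", job_file_path, "-t", timestamp]
```

For example, `delayed_at_command("job.sh", datetime(2025, 3, 4, 5, 6))` gives `["at", "-f", "job.sh", "-t", "202503040506"]`.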
Whereas `at` would run the job as soon as Cylc deems the task ready to run according to the task graph and cycle requirements, with `cats` specified it would effectively (or actually, depending on the implementation you advise) immediately run the CATS calculation `cats --duration <expected duration> --location <postcode> --format json` to return the optimal run time in a machine-parsable format (JSON), then hold the task in a waiting state until the wall clock reaches that time, and finally run it.
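The calculation step could be sketched roughly as follows. The `cats` flags are the ones quoted above; the shape of the JSON output is an assumption made purely for illustration (a top-level field holding an ISO 8601 timestamp), so the key name is a parameter and would need checking against the real CATS output. All function names here are hypothetical.

```python
import json
import subprocess
from datetime import datetime


def cats_command(duration, postcode):
    """Build the CATS command line using the flags quoted in this issue."""
    return ["cats", "--duration", str(duration),
            "--location", postcode, "--format", "json"]


def parse_start(cats_json, key="start"):
    """Extract the start time from CATS JSON output.

    ASSUMPTION: the output is a JSON object with an ISO 8601 timestamp
    under `key`; the default key name is a placeholder, not the real schema.
    """
    return datetime.fromisoformat(json.loads(cats_json)[key])


def optimal_start(duration, postcode):
    """Run CATS and return the start time it recommends."""
    result = subprocess.run(cats_command(duration, postcode),
                            capture_output=True, text=True, check=True)
    return parse_start(result.stdout)
```

Keeping the command building and the parsing separate from the subprocess call means the parsing can be adjusted to the actual output format without touching the rest.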
The minimum required configuration, provided in the job runner sub-section of the relevant part of the Cylc config file, would be the expected job duration (suggested key name `duration`) and a postcode as a proxy for the location (suggested key name `postcode`).
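As a sketch of how those two settings might be checked before calling CATS, assuming they arrive as a plain mapping of strings: the class `CatsSettings` and the method `from_mapping` are hypothetical, and treating `duration` as a positive integer is my assumption rather than a settled design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatsSettings:
    duration: int  # expected job duration, passed to `cats --duration`
    postcode: str  # proxy for location, passed to `cats --location`

    @classmethod
    def from_mapping(cls, settings):
        """Validate the two required keys and return a CatsSettings."""
        missing = [k for k in ("duration", "postcode") if k not in settings]
        if missing:
            raise ValueError("missing CATS settings: " + ", ".join(missing))
        duration = int(settings["duration"])
        if duration <= 0:
            raise ValueError("duration must be a positive integer")
        postcode = str(settings["postcode"]).strip()
        if not postcode:
            raise ValueError("postcode must not be empty")
        return cls(duration=duration, postcode=postcode)
```

Failing early with a clear message seems preferable to letting a bad value surface later as an opaque CATS error at submission time.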
I am happy to write the code to implement this. Please let us know your thoughts.