Hi Daniel - just stumbled across your package - very nice! I was hoping to find such an extension rather than write one from scratch :-)
In reading through the docs, one question I had for you was whether you've thought much about job-specific polling intervals. In particular, over the lifecycle of a job it would be of interest to balance keeping information fresh against hitting the slurmdb too hard.
For example, a heuristic might poll every 5-10 seconds for the first minute of a job, then back off with some sort of step function to a steady state of every 300 seconds. That would better catch jobs that fail as they start, but settle down once the job really gets going.
Curious if this is something you've thought about much?
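Just to make the idea concrete, here's a rough sketch of the kind of step function I mean (the thresholds and intervals below are made-up illustrative values, not anything from your package):

```python
def poll_interval(elapsed_seconds: float) -> int:
    """Return the polling interval (seconds) given time since the job started.

    Polls frequently right after a job starts, then backs off in steps
    to a steady-state interval. All numbers here are illustrative.
    """
    schedule = [
        (60, 5),     # first minute: every 5 s, to catch early failures
        (300, 30),   # up to 5 min: every 30 s
        (900, 120),  # up to 15 min: every 2 min
    ]
    for threshold, interval in schedule:
        if elapsed_seconds < threshold:
            return interval
    return 300       # steady state: every 5 min

print(poll_interval(10))    # 5
print(poll_interval(600))   # 120
print(poll_interval(3600))  # 300
```

The schedule table would presumably be configurable, since the right trade-off depends on how loaded the slurmdb is on a given cluster.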
That would certainly be a useful feature! One potential issue is detecting when jobs start. If a job is pending in the queue, then you'd have to poll it more frequently to detect when it starts. The polling heuristic would be less helpful if it missed the job start time by a significant margin.
It could still be useful on systems where jobs tend to run immediately or the refresh happens to catch the beginning of a job. The "refresh curve" could apply to the polling interval for pending jobs as well.
My next objective is to add better job array support, but I can look into this after that. Thanks for the suggestion :)