feat!: job arrays #174

15 changes: 15 additions & 0 deletions snakemake_executor_plugin_slurm/__init__.py
@@ -5,6 +5,7 @@

import csv
from io import StringIO
from itertools import groupby
import os
import re
import shlex
@@ -107,6 +108,20 @@ def warn_on_jobcontext(self, done=None):
    def additional_general_args(self):
        return "--executor slurm-jobstep --jobs 1"

    def run_jobs(self, jobs: List[JobExecutorInterface]):
        for _, same_rule_jobs in groupby(jobs, key=lambda job: job.rule.name):
            if len(same_rule_jobs) == 1:
                self.run_job(same_rule_jobs[0])
            else:
                # TODO submit as array
                # share code with run_job

                # TODO in the future: give a hint to the scheduler to select preferably
                # many jobs from the same rule if possible, in order to have
                # more efficient array jobs. This should be somehow tunable, because
                # it might contradict other efficiency goals.
                ...


🛠️ Refactor suggestion

Implement placeholder for array jobs

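The current implementation uses an ellipsis (...) as a placeholder. In Python a bare ... is a no-op expression, so it does not raise NotImplementedError; any group containing more than one job is silently skipped instead of being submitted. Until array job submission is implemented, these jobs should be handled individually.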
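The behaviour of a bare ellipsis can be checked with a small standalone sketch. The function `dispatch` and the string job names are hypothetical stand-ins, not part of the plugin; only the if/else shape mirrors `run_jobs`.

```python
def dispatch(group, submitted):
    """Mimic the if/else in run_jobs, with a bare ellipsis in the else branch."""
    if len(group) == 1:
        submitted.append(group[0])
    else:
        ...  # evaluates to the Ellipsis object and is discarded; nothing runs


submitted = []
dispatch(["a"], submitted)
dispatch(["b", "c"], submitted)  # no exception is raised, but nothing is submitted
print(submitted)
```

The second call returns normally, which is why the multi-job groups would be lost without any error message.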

Apply this improvement to handle multi-job groups temporarily:

         else:
             # TODO submit as array
             # share code with run_job
-            ...
+            # Temporary implementation: submit jobs individually until array support is added
+            for job in same_rule_jobs:
+                self.run_job(job)

Suggested change
-            else:
-                # TODO submit as array
-                # share code with run_job
-                # TODO in the future: give a hint to the scheduler to select preferably
-                # many jobs from the same rule if possible, in order to have
-                # more efficient array jobs. This should be somehow tunable, because
-                # it might contradict other efficiency goals.
-                ...
+            else:
+                # TODO submit as array
+                # share code with run_job
+                # TODO in the future: give a hint to the scheduler to select preferably
+                # many jobs from the same rule if possible, in order to have
+                # more efficient array jobs. This should be somehow tunable, because
+                # it might contradict other efficiency goals.
+                # Temporary implementation: submit jobs individually until array support is added
+                for job in same_rule_jobs:
+                    self.run_job(job)
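One further caveat on the grouping itself: `itertools.groupby` yields one-shot iterators, so `len(same_rule_jobs)` and `same_rule_jobs[0]` would raise a `TypeError`, and it only merges adjacent items, so unsorted input can split one rule into several groups. A minimal sketch of a grouping step that avoids both issues follows. The helper `group_by_rule` and the `make_job` stand-ins are hypothetical and only assume that each job exposes `job.rule.name`, as in the diff.

```python
from itertools import groupby
from operator import attrgetter
from types import SimpleNamespace


def group_by_rule(jobs):
    """Return lists of jobs that share a rule name.

    The input is sorted first because groupby only merges adjacent items,
    and each group is materialized because groupby yields one-shot iterators.
    """
    rule_name = attrgetter("rule.name")
    ordered = sorted(jobs, key=rule_name)
    return [list(group) for _, group in groupby(ordered, key=rule_name)]


def make_job(rule, jobid):
    # Hypothetical stand-in for a real job object; only rule.name is assumed.
    return SimpleNamespace(rule=SimpleNamespace(name=rule), jobid=jobid)


jobs = [make_job("align", 1), make_job("sort", 2), make_job("align", 3)]
groups = group_by_rule(jobs)
for group in groups:
    if len(group) == 1:
        print("single job:", group[0].jobid)
    else:
        print("array candidate:", [job.jobid for job in group])
```

With lists in hand, both the single-job branch and the temporary per-job loop in the suggestion work unchanged.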

johanneskoester marked this conversation as resolved.
    def run_job(self, job: JobExecutorInterface):
        # Implement here how to run a job.
        # You can access the job's resources, etc.