Handle batch job submissions with ease, both on remote clusters using Slurm and locally in the background.
- Automatic wandb group naming from config `.yaml` files
- Automatic sweep over hyperparameters (`sweep.foo=[1,2,3] sweep.bar=["alice","bob"]`)
- Slurm job submission
- Common config files shared across different scripts
- Additive `host.time` for config files
- `wandb_group_suffix` for the default wandb name
- Test runs on the local machine with `exps.test=true`
- `host.timefactor`: multiply the requested time by this factor for a particular host
- Skip host retrieval when `exps.test=true` (since it does not matter)
- Non-Slurm background script submission on the local machine
- CPU core constraints for local non-Slurm scripts
```bash
git clone <this repo>
cd exps_launcher
pip install -r requirements.txt
pip install .
```
Create the `exps_launcher_configs` directory inside your project directory, following the structure below. Then, copy the `launch_exps.py` file into your project directory.
Set the `EXPS_HOSTNAME` environment variable, which the launcher uses to select the host configuration:

```bash
export EXPS_HOSTNAME=<customhostname>
```
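For example, if you set `EXPS_HOSTNAME=mycluster`, a file `host/mycluster.yaml` would presumably hold the sbatch parameters for that machine. The keys and values below are illustrative assumptions, not required names:

```yaml
# Hypothetical host/mycluster.yaml -- sbatch parameters for this host.
# Key names are illustrative; see host/default.yaml in your own project.
partition: gpu
time: "12:00:00"
timefactor: 1.5   # multiply the requested time by this factor on this host
```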
- Project structure:

```
.
└── exps_launcher_configs/
    ├── config.yaml              # exps_launcher configs (can be overwritten by cli args exps.<params>)
    ├── host/
    │   ├── default.yaml         # default params for all hosts
    │   ├── host1.yaml           # host-specific sbatch parameters (--partition, --project, ...)
    │   ├── host2.yaml
    │   └── ...
    ├── scripts/
    │   ├── script1/             # `python script1.py ...`
    │   │   ├── default.yaml     # default params for `script1.py`
    │   │   ├── test.yaml        # params for testing `script1.py`
    │   │   ├── conf1.yaml       # configuration
    │   │   └── conf2.yaml
    │   └── script2/
    │       └── ...
    └── sweeps/                  # sweep configurations
        └── fiveseeds.yaml
```
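As a sketch of what a script configuration might contain, a hypothetical `conf1.yaml` could combine plain script parameters with `host.*` and `sweep.*` overrides. All keys and values below are illustrative assumptions, not part of the launcher's documented schema:

```yaml
# Hypothetical scripts/script1/conf1.yaml (illustrative keys only)
lr: 0.001            # plain parameters are passed to script1.py
batch_size: 64
host:
  time: "03:00:00"   # overwrites the host definition for this config
sweep:
  foo: [1, 10, 100]  # sweep parameters can also live here
```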
- Launch the `launch_exps.py` script:

```bash
python launch_exps.py script=script1 config=[conf1,conf2] sweep.config=[fiveseeds] sweep.foo=[1,10,100]
```
Notes:
- Subdirectories of `scripts/` must be named after the corresponding Python script. E.g., the example above refers to existing Python scripts `script1.py` and `script2.py` in your project root directory.
- The `script` parameter is mandatory and must be a single string: a single script can be launched at a time.
- A sequence of configuration files can be provided with the `config` parameter. Priority is in decreasing order, i.e. `conf2` overwrites `conf1` in the example above.
- Check out the config files above (e.g. `conf1.yaml` and `conf2.yaml`) for quick examples of how to use them.
- Sweep parameters like `sweep.foo=[1,10,100]` can also be defined in script-specific config files (e.g. in `conf1.yaml`).
- Host parameters like `host.time="03:00:00"` can also be defined in script-specific config files (e.g. in `conf1.yaml`), where they overwrite the host definitions. This way you can, e.g., specify different sbatch times for different scripts and their corresponding configurations, or different sbatch job names (`host.job-name="myscript"`).
- You can pass `exps.hostname` to overwrite the hostname for the current experiment, e.g. to have different host configurations for the same host.
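The sweep mechanics can be pictured as a Cartesian product: every `sweep.<param>=[...]` contributes one axis, and one job is launched per combination. The sketch below is a minimal re-implementation for illustration only, not the launcher's actual code; the function name and command format are assumptions:

```python
from itertools import product

def expand_sweep(base_args, sweep):
    """Illustrative: one command line per combination of the sweep axes."""
    keys = list(sweep)
    jobs = []
    for values in product(*(sweep[k] for k in keys)):
        overrides = " ".join(f"{k}={v}" for k, v in zip(keys, values))
        jobs.append(f"python script1.py {base_args} {overrides}")
    return jobs

# Two axes of sizes 3 and 2 yield 3 * 2 = 6 jobs.
jobs = expand_sweep("lr=0.001", {"foo": [1, 10, 100], "seed": [0, 1]})
print(len(jobs))  # 6
```

So `sweep.foo=[1,10,100]` combined with a five-seed sweep config would submit fifteen jobs, one per (foo, seed) pair.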
Advanced commands:
- Use `exps.<param_name>=<value>` for config options. Accepted params are:
  - `exps.test=false` [launch the script in test mode, using the `test.yaml` config]
  - `exps.no_confirmation=false` [skip asking for confirmation before actually launching the experiments]
  - `exps.fake=false` [display the summary but do not actually launch the experiments]
  - `exps.hostname` [overwrite the hostname defined in the env variable `EXPS_HOSTNAME`]
  - `exps.force_hostname_environ=true` [force using the env variable to define the hostname]
  - `exps.group_suffix="_mySuffix"` [assumes the script has a `--group` parameter for the wandb group name]
  - `exps.noslurm=false` [launch the script locally, instead of as a Slurm job]
- CPU usage constraints:
  - (noslurm, single script) `exps.cpus-list="50,51,52"`
  - (noslurm, multiple scripts) `exps.cpus-start=50 exps.cpus-per-task=4`
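The two CPU options can be read as follows: `cpus-list` pins a single local script to an explicit set of cores, while `cpus-start`/`cpus-per-task` carve out consecutive core blocks for several scripts. A hedged sketch of how such pinning is commonly done with `taskset` (this may differ from the launcher's actual implementation; the function names are hypothetical):

```python
def taskset_prefix_single(cpus_list):
    """Pin one local script to an explicit core list, e.g. '50,51,52'."""
    return f"taskset --cpu-list {cpus_list}"

def taskset_prefixes_multi(cpus_start, cpus_per_task, n_jobs):
    """Give each of n_jobs its own consecutive block of cpus_per_task cores."""
    prefixes = []
    for i in range(n_jobs):
        start = cpus_start + i * cpus_per_task
        cores = ",".join(str(c) for c in range(start, start + cpus_per_task))
        prefixes.append(f"taskset --cpu-list {cores}")
    return prefixes

print(taskset_prefixes_multi(50, 4, 2))
# ['taskset --cpu-list 50,51,52,53', 'taskset --cpu-list 54,55,56,57']
```

Each prefix would then be prepended to the background command so that concurrent local scripts do not compete for the same cores.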
List of repositories that use this experiments launcher, which you can consult for further reference: