Clearly define the difference between srun and salloc #54

Open
satyaog opened this issue Aug 17, 2021 · 5 comments

Comments

@satyaog
Member

satyaog commented Aug 17, 2021

[Image: Screen Shot 2021-08-07 at 5 11 56 PM]
This is confusing: salloc was just defined, so saying that it "can also be used" reads oddly. Perhaps it'd be better to clearly define the difference between srun and salloc? https://stackoverflow.com/questions/22152400/slurm-what-is-the-difference-for-code-executing-under-salloc-vs-srun

Originally posted by @tesfaldet in #46 (comment)

@fosterrath-mila
Contributor

This is partially explained by recent PRs to the theoretical section. Maybe this requires more explanatory examples and reference to further docs.

@ahmam
Contributor

ahmam commented Nov 1, 2021

salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.

srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared resources within the job's node allocation.
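
As a rough illustration of the workflow described above (a sketch, not an official example from the docs): salloc obtains the allocation and runs a command inside it, and each srun in that command is a job step. The `run` helper and the resource values (`--ntasks=2`, `--time=00:05:00`) are hypothetical choices for illustration; the helper only prints the command when Slurm is not installed.

```shell
#!/bin/sh
# Hypothetical helper: run the command if it is installed, otherwise print it.
run() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# salloc requests the allocation, then runs the given command inside it.
# Each srun inside that command is a separate job step:
#   the first step uses the allocation's task count (2 tasks),
#   the second step overrides the task count to 1.
# The resource values are illustrative assumptions, not site defaults.
run salloc --ntasks=2 --time=00:05:00 \
    sh -c 'srun hostname; srun --ntasks=1 hostname'
```

When salloc is given no command, it typically spawns a shell instead, and the same srun commands can be typed there interactively.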

@ahmam
Contributor

ahmam commented Nov 1, 2021

@satyaog I will add this explanation. If it is clear, I will open a merge request.
Difference between salloc and srun

srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared resources within the job's node allocation. Furthermore, srun can also be invoked outside of a job allocation. In that case, srun requests resources and, when those resources are granted, launches tasks across those resources as a single job and job step. salloc, by contrast, is only used to allocate resources for a job in real time. Typically it is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.

@satyaog
Member Author

satyaog commented Dec 3, 2021

@ahmam thanks this looks good to me

@tesfaldet
Contributor

sbatch and salloc allocate resources to a job, while srun launches parallel tasks across those resources. When invoked within a job allocation, srun will launch parallel tasks across some or all of the allocated resources. In that case, srun inherits by default the pertinent options of the sbatch or salloc which it runs under. You can then (usually) provide srun different options which will override what it receives by default. Each invocation of srun within a job is known as a job step.

srun can also be invoked outside of a job allocation. In that case, srun requests resources, and when those resources are granted, launches tasks across those resources as a single job and job step.

^Taken from the slurm-users mailing list
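
To make the two cases in the quote concrete, here is a minimal sketch that reports how an srun call would be treated in the current shell. It assumes that the `SLURM_JOB_ID` environment variable is what marks "inside a job allocation" (Slurm sets it for jobs started by sbatch and salloc); the function name `srun_context` and the example resource values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: describe how srun would behave in this environment,
# assuming SLURM_JOB_ID is set only inside a job allocation.
srun_context() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "job step in allocation $SLURM_JOB_ID"
    else
        echo "new job with a single job step"
    fi
}

echo "srun here would launch a $(srun_context)"

# Outside an allocation, srun requests the resources itself, e.g.:
#   srun --ntasks=2 --time=00:05:00 hostname
# Inside sbatch or salloc, srun inherits the allocation's options by default
# and can usually override them for one step, e.g.:
#   srun --ntasks=1 hostname
```

Running this on a login node and again inside a shell started by salloc should print the two different messages, which is a quick way to check which of the two behaviours applies.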

@btravouillon btravouillon assigned obilaniu and unassigned ahmam Jan 25, 2023

5 participants