This file contains information concerning the use of the new job array
features in TORQUE 2.5.
--- WARNING ---
TORQUE 2.5 uses a new format for job arrays. It is not backward
compatible with job arrays from version 2.3 or 2.4. Therefore, it is
imperative that the system be drained of any job arrays BEFORE upgrading.
Upgrading with job arrays queued or running may cause data loss, crashes,
or other failures, and is not supported.
COMMAND UPDATES FOR ARRAYS
--------------------------
The commands qalter, qdel, qhold, and qrls now all support TORQUE
arrays and will have to be updated. The general command syntax is:
command <array_name> [-t array_range] [other command options]
The array ranges accepted by -t here are exactly the same as the array ranges
that can be specified in qsub.
(http://www.clusterresources.com/products/torque/docs/commands/qsub.shtml)
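As an illustration of the syntax above, the sketch below assembles command
lines for the four array-aware commands and prints them without running
anything. The helper name build_array_cmd and the array 189[] are assumptions
made for this example and are not part of TORQUE. Quoting the array name keeps
the shell from treating the square brackets as a glob pattern.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a command of the form
#   command <array_name> [-t array_range]
build_array_cmd() {
    cmd=$1
    array=$2
    range=${3:-}    # optional; empty means "act on the whole array"

    # Only the commands named in this section are accepted.
    case $cmd in
        qalter|qdel|qhold|qrls) ;;
        *) echo "unsupported command: $cmd" >&2; return 1 ;;
    esac

    if [ -n "$range" ]; then
        printf '%s %s -t %s\n' "$cmd" "$array" "$range"
    else
        printf '%s %s\n' "$cmd" "$array"
    fi
}

build_array_cmd qhold '189[]' 0-9    # prints: qhold 189[] -t 0-9
build_array_cmd qdel '189[]'         # prints: qdel 189[]
```

Printing the command first makes it easy to check the array name and range
before handing the line to a real TORQUE installation.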
SLOT LIMITS
--------------------------
It is now possible to limit the number of jobs that can run concurrently in a
job array. This is called a slot limit, and the default is unlimited. The slot
limit can be set in two ways.
The first method can be done at job submission:
qsub script.sh -t 0-299%5
This sets the slot limit to 5, meaning only 5 jobs from this array can be
running at the same time.
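To make the two parts of the -t argument explicit, the following sketch splits
a specification such as 0-299%5 into its range and its slot limit.
split_array_spec is a hypothetical helper written for this example; it only
handles the single trailing %N form shown above and does not validate the
range itself.

```shell
#!/bin/sh
# Hypothetical helper: split "RANGE%LIMIT" into its two parts.
# With no "%", the slot limit is reported as unlimited (the stated default).
split_array_spec() {
    spec=$1
    case $spec in
        *%*)
            range=${spec%%\%*}   # everything before the first "%"
            limit=${spec#*%}     # everything after the first "%"
            ;;
        *)
            range=$spec
            limit=unlimited
            ;;
    esac
    printf 'range=%s slot_limit=%s\n' "$range" "$limit"
}

split_array_spec '0-299%5'   # prints: range=0-299 slot_limit=5
split_array_spec '0-299'     # prints: range=0-299 slot_limit=unlimited
```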
The second method applies server-wide, using the server parameter
max_slot_limit. Because administrators are, in many cases, more likely than
users to be concerned with limiting arrays, the max_slot_limit parameter is
a convenient way to set a global policy. If max_slot_limit is not set, the
default limit is unlimited. To set max_slot_limit, use the following queue
manager command:
qmgr -c 'set server max_slot_limit=10'
This means that no array can request a slot limit greater than 10, and
any array not requesting a slot limit will receive a slot limit of 10.
If a user requests a slot limit greater than 10, the job will be
rejected with the message:
Requested slot limit is too large, limit is X
where X is the server's max_slot_limit (10 in this case).
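The interaction between a requested slot limit and max_slot_limit can be
modeled with a small shell function. This is only a sketch of the rules
described in this section, not the server's implementation; the function name
effective_slot_limit is an assumption, and the exact wording of the real
rejection message may differ from the line echoed here.

```shell
#!/bin/sh
# Hypothetical model of the rules described in this section.
#   $1: slot limit requested at submission (empty if none was given)
#   $2: server max_slot_limit (empty if the parameter is unset)
effective_slot_limit() {
    requested=${1:-}
    max=${2:-}

    if [ -z "$max" ]; then
        # No server-wide policy: use the request, or unlimited by default.
        echo "${requested:-unlimited}"
    elif [ -z "$requested" ]; then
        # No request: the array receives the server-wide limit.
        echo "$max"
    elif [ "$requested" -gt "$max" ]; then
        # Request exceeds the policy: the job is rejected.
        echo "Requested slot limit is too large, limit is $max" >&2
        return 1
    else
        echo "$requested"
    fi
}

effective_slot_limit 5 10     # prints: 5
effective_slot_limit "" 10    # prints: 10
effective_slot_limit 20 10 || echo "job would be rejected"
```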
If you are using TORQUE with a scheduler such as Moab or Maui, it is
recommended that you also set the server parameter moab_array_compatible=true.
Setting moab_array_compatible puts all jobs over the slot limit on hold,
so the scheduler will not try to schedule jobs above the slot limit.
JOB ARRAY DEPENDENCIES
--------------------------
The following dependencies can now be used for job arrays:
afterstartarray
afterokarray
afternotokarray
afteranyarray
beforestartarray
beforeokarray
beforenotokarray
beforeanyarray
The general syntax is:
qsub script.sh -W depend=dependtype:array_name[num_jobs]
The suffix [num_jobs] should appear exactly as above, although the number of
jobs is optional. If it is not specified, the dependency applies to the
entire array. For example:
qsub script.sh -W depend=afterokarray:427[]
will assume every job in array 427[] has to finish successfully for the
dependency to be satisfied. The submission:
qsub script.sh -W depend=afterokarray:427[][5]
means that 5 of the jobs in array 427 have to successfully finish in order for
the dependency to be satisfied.
NOTE: It is important to remember that the "[]" is part of the array name.
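Because the doubled brackets are easy to mistype, the sketch below builds the
value passed to -W from its parts. array_depend is a hypothetical helper, not
a TORQUE command. Appending "[]" when the array name lacks it is a design
choice made for this example, following the note above.

```shell
#!/bin/sh
# Hypothetical helper: build a "depend=..." string for an array dependency.
#   $1: dependency type   $2: array name   $3: optional number of jobs
array_depend() {
    type=$1
    array=$2
    count=${3:-}

    # Accept only the dependency types listed in this section.
    case $type in
        afterstartarray|afterokarray|afternotokarray|afteranyarray) ;;
        beforestartarray|beforeokarray|beforenotokarray|beforeanyarray) ;;
        *) echo "unknown array dependency type: $type" >&2; return 1 ;;
    esac

    # The "[]" is part of the array name; add it if the caller left it off.
    case $array in
        *'[]'*) ;;
        *) array="${array}[]" ;;
    esac

    if [ -n "$count" ]; then
        printf 'depend=%s:%s[%s]\n' "$type" "$array" "$count"
    else
        printf 'depend=%s:%s\n' "$type" "$array"
    fi
}

array_depend afterokarray 427 5    # prints: depend=afterokarray:427[][5]
array_depend afterokarray '427[]'  # prints: depend=afterokarray:427[]

# Possible use on a system with TORQUE installed:
#   qsub script.sh -W "$(array_depend afterokarray 427 5)"
```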
QSTAT FOR JOB ARRAYS
--------------------------
Normal qstat output will display a summary of the array instead of displaying
the entire array, job for job.
qstat -t will expand the output to display the entire array.
ARRAY NAMING CONVENTION
--------------------------
Arrays are now named with brackets following the array name, for example:
dbeer@napali:~/dev/torque/array_changes$ echo sleep 20 | qsub -t 0-299
189[].napali
Individual jobs in the array are now also denoted using square brackets instead
of dashes. For example, here is part of the output of qstat -t for the above
array:
189[287].napali STDIN[287] dbeer 0 Q batch
189[288].napali STDIN[288] dbeer 0 Q batch
189[289].napali STDIN[289] dbeer 0 Q batch
189[290].napali STDIN[290] dbeer 0 Q batch
189[291].napali STDIN[291] dbeer 0 Q batch
189[292].napali STDIN[292] dbeer 0 Q batch
189[293].napali STDIN[293] dbeer 0 Q batch
189[294].napali STDIN[294] dbeer 0 Q batch
189[295].napali STDIN[295] dbeer 0 Q batch
189[296].napali STDIN[296] dbeer 0 Q batch
189[297].napali STDIN[297] dbeer 0 Q batch
189[298].napali STDIN[298] dbeer 0 Q batch
189[299].napali STDIN[299] dbeer 0 Q batch
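Scripts that process job IDs in the new format may need to pull the array
index out of the brackets. The following is a minimal sketch using only POSIX
parameter expansion. array_index is a hypothetical helper, and the ID layout
is assumed to match the examples above.

```shell
#!/bin/sh
# Hypothetical helper: print the index of an individual array job ID
# such as 189[287].napali. Fails for IDs with no index, including the
# name of the whole array (189[].napali).
array_index() {
    id=$1
    case $id in
        *'['*']'*) ;;          # must contain a bracket pair
        *) return 1 ;;
    esac
    idx=${id#*\[}              # drop everything through the first "["
    idx=${idx%%\]*}            # drop everything from the first "]" on
    [ -n "$idx" ] || return 1  # empty brackets name the array itself
    printf '%s\n' "$idx"
}

array_index '189[287].napali'   # prints: 287
array_index '189[].napali' || echo "whole array, no index"
```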