<h1 id="discovering-job-statistics-with-jobstats">Discovering job statistics with <code>jobstats</code></h1>
<p>UPPMAX provides <code>jobstats</code> to enable discovery of resource usage for jobs submitted to the SLURM job queue.</p>
<pre><code>jobstats -p [ -r ] [ -M cluster ] [options] [ jobid [ jobid ... ] | -A project | - ]</code></pre>
<p>With the <code>-p</code>/<code>--plot</code> option, a plot is produced from the jobstats for each jobid. Plots contain one panel per booked node showing CPU (blue) and memory usage (black) traces and include text lines indicating the job number, cluster, end time and duration, user, project, job name, and usage flags (more on those below). For memory usage, one or two traces are shown: a solid black line shows instantaneous memory usage, and a dotted black line shows overall maximum memory usage if this information is available.</p>
<p>Plots are saved to the current directory with the name</p>
<pre><code>cluster-project-user-jobid.png</code></pre>
<p>An example plot, named <code>milou-b2010042-douglas-8769275.png</code>:</p>
<div class="figure">
<img src="milou-b2010042-douglas-8769275.png" />
</div>
<p>For multiple-node jobs, plots have a two-column format.</p>
<p>Note that not all jobs will produce jobstats files, particularly if the job was cancelled or ran for less than 5 minutes. Also, if a job did not use all of the nodes it booked, jobstats files will not be available for the booked but unused nodes. For each such node the plot will show a blank panel containing the message 'node booked but unused'.</p>
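<p>As a minimal example, a plot like the one above could be produced with a command along the following lines; the job ID and cluster name here are simply taken from the example filename, and <code>-M</code> is only needed when querying from a cluster other than the one the job ran on:</p>
<pre><code># Produce a usage plot for a finished job; -M selects the cluster the job ran on.
jobstats -p -M milou 8769275</code></pre>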
<h1 id="modes-of-jobstats-discovery">Modes of jobstats discovery</h1>
<p>There are five modes of discovery, depending on what the user provides on the command line: (1) discovery by job number for a completed job; (2) discovery by job number for a currently running job; (3) discovery by node and job number, for a completed or running job; (4) discovery by project; or (5) discovery via information provided on <code>stdin</code>. In the example command lines below, the <code>-p</code>/<code>--plot</code> option requests that plots of job resource usage be created.</p>
<p><strong>Mode 1, finished jobs, by job ID:</strong> <code>jobstats -p jobid1 jobid2 jobid3</code></p>
<p><code>finishedjobinfo</code> is used to determine further information for each job. As this can be rather time-consuming, a message is printed asking for your patience. If multiple queries are expected it is more efficient to run <code>finishedjobinfo</code> yourself separately; see Mode 4 below. See Mode 2 for a currently running job.</p>
<p><strong>Mode 2, running jobs, by job ID:</strong> <code>jobstats -p -r jobid1 jobid2 jobid3</code></p>
<p>The job IDs given are for jobs currently running on the cluster. The SLURM <code>squeue</code> tool is used to determine further information for each running job.</p>
<p><strong>Mode 3, single finished job when nodes are known:</strong> <code>jobstats -p -n m15,m16 jobid</code></p>
<p><code>finishedjobinfo</code> is <em>not</em> called; instead, UPPMAX's stored job statistics files for the cluster of interest are discovered directly. If you know which node(s) your job ran on, or which nodes you are interested in, this will be much faster than Mode 1.</p>
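<p>Nodes can be given either as a comma-separated list of complete node names or using the <code>finishedjobinfo</code> bracket syntax described for <code>-n</code> below; the job ID here is just a placeholder:</p>
<pre><code># Equivalent node specifications; do not mix the two syntaxes.
jobstats -p -n m78,m90,m91,m92,m100 1234567
jobstats -p -n 'm[78,90-92,100]' 1234567   # quoted to avoid shell globbing</code></pre>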
<p><strong>Mode 4, all available jobs in a project:</strong> <code>jobstats -p -A project</code></p>
<p>When providing a project name that is valid for the cluster, <code>finishedjobinfo</code> is used to determine further information on jobs run within the project. As for Mode 1, this can be rather slow, and a message asking for your patience is printed. Furthermore only <code>finishedjobinfo</code> defaults for time span etc. are used for job discovery. If multiple queries are expected or additional <code>finishedjobinfo</code> options are desired, see Mode 5 below.</p>
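<p>As a sketch, <code>-A</code> can be combined with <code>-M</code> to plot all discoverable jobs for a project on another cluster; the project and cluster names here are taken from the example plot above:</p>
<pre><code># All available jobs for the project on the milou cluster, with plots.
jobstats -p -M milou -A b2010042</code></pre>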
<p><strong>Mode 5, via stdin:</strong> <code>finishedjobinfo -q project | jobstats - -p</code></p>
<p>Accept input on stdin formatted like <code>finishedjobinfo</code> output. Note the single dash <code>-</code> option given to <code>jobstats</code>; the long form of this option is <code>--stdin</code>. This mode can be especially useful if multiple queries of the same job information are expected. In this case, save the output of a single comprehensive <code>finishedjobinfo</code> query, then extract the parts of interest and present them to this script on stdin. For example, to analyse all completed jobs in a project during the current calendar year, and to produce separate tarballs analysing all jobs and providing jobstats plots for each user during the same period:</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="ot">project=</span>myproj
<span class="kw">finishedjobinfo</span> -q -y <span class="ot">${project}</span> <span class="kw">></span> <span class="ot">${project}</span>-year.txt
<span class="kw">grep</span> <span class="st">'jobstat=COMPLETED'</span> <span class="ot">${project}</span>-year.txt <span class="kw">|</span> <span class="kw">jobstats</span> - <span class="kw">></span> <span class="ot">${project}</span>-completed-jobs.txt
<span class="kw">for</span> <span class="kw">u</span> in user1 user2 user3 <span class="kw">;</span> <span class="kw">do</span>
<span class="kw">grep</span> <span class="st">"username=</span><span class="ot">${u}</span><span class="st">"</span> <span class="ot">${project}</span>-year.txt <span class="kw">|</span> <span class="kw">jobstats</span> - -p <span class="kw">></span> <span class="ot">${u}</span>-jobs.txt
<span class="kw">tar</span> czf <span class="ot">${u}</span>-jobs.tar.gz <span class="ot">${u}</span>-jobs.txt *-<span class="ot">${project}</span>-<span class="ot">${u}</span>-*.png
<span class="kw">done</span></code></pre>
<h1 id="command-line-options">Command-Line Options</h1>
<p>Run <code>jobstats -h</code> for detailed help, including a complete list of command-line options.</p>
<pre><code> -p | --plot Produce CPU and memory usage plot for each jobid
-r | --running Jobids are for jobs currently running on the cluster. The
SLURM squeue tool is used to discover further information
for the running jobs, and the rightmost extent of the plot
produced will reflect the scheduled end time of the job.
-A project Project valid on the cluster. finishedjobinfo is used to
discover jobs for the project. See further comments
under 'Mode 4' above.
-M cluster Cluster on which jobs were run [default current cluster]
-n node[,node...] Cluster node(s) on which the job was run. If specified,
then the finishedjobinfo script is not run and discovery
is restricted to only the specified nodes. Nodes can be
specified as a comma-separated list of complete node
names, or using the finishedjobinfo syntax:
m78,m90,m91,m92,m100 or m[78,90-92,100]
Nonsensical results will occur if the syntaxes are mixed.
- | --stdin Accept input on stdin formatted like finishedjobinfo
output. The short form of this option is a single dash
'-'.
-m | --memory Always include memory usage flags in output. Default
behaviour is to include memory usage flags only if CPU
usage flags are also present.
-v | --verbose Be wordy when describing flag values.
-b | --big-plot Produce 'big plot' with double the usual dimensions.
This implies '-p/--plot'.
-q | --quiet Do not produce table output
-Q | --Quick Run finishedjobinfo with the -q option, which is slightly
faster but does not include SLURM's record of maximum
memory used. With this option, memory usage analyses can
only rely upon what is reported at 5-minute intervals,
and the trace of maximum memory used (dotted black line)
is not produced.
-d Produce a header for table output
--version Produce version of jobstats and plot_jobstats, then exit
-h | --help | -? Produce detailed help information
</code></pre>
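<p>As a sketch of how the options above combine, the following produces a double-size plot (<code>-b</code> implies <code>-p</code>), always includes memory flags, and describes any flags verbosely; the job ID is a placeholder:</p>
<pre><code># Big plot, memory flags always included, verbose flag descriptions.
jobstats -b -m -v 1234567</code></pre>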
<h1 id="additional-options">Additional Options</h1>
<p>The following command-line options are generally only useful for UPPMAX staff.</p>
<pre><code>--cpu-free FLOAT Maximum CPU busy percentage for the CPU to count as
free at that sampling time. Default is 3 %.
-x directory Directory prefix to use for jobstats files. Default is
'/sw/share/slurm', and directory structure is
<prefix>/<cluster>/uppmax_jobstats/<node>/<jobid>
-X directory Hard directory prefix to use for jobstats files.
Jobstats files are assumed available directly:
'<hard-prefix>/<jobid>'
--no-multijobs Run finishedjobinfo separately for each jobid, rather
than once with all jobids bundled into one -j option
-f file finishedjobinfo script
-P file plot_jobstats script</code></pre>
<h1 id="further-details">Further Details</h1>
<p>This script produces two types of output. If the <code>-p</code>/<code>--plot</code> command-line option is provided, a plot of core and memory usage across the life of each job is created. The name of the file produced has the format:</p>
<pre><code>cluster-project-user-jobid.png</code></pre>
<p>Unless the <code>-q</code>/<code>--quiet</code> option is provided, a table is also produced, containing one line per job with the following tab-separated fields:</p>
<pre><code>jobid cluster jobstate user project endtime runtime flags booked maxmem cores node[,node...] jobstats[,jobstats...]</code></pre>
<p>Field contents:</p>
<ul>
<li><code>jobid</code> : Job ID</li>
<li><code>cluster</code> : Cluster on which the job was run</li>
<li><code>jobstate</code> : End status of the job: COMPLETED, RUNNING, FAILED, TIMEOUT, CANCELLED</li>
<li><code>user</code> : Username that submitted the job</li>
<li><code>project</code> : Project account under which the job was run</li>
<li><code>endtime</code> : End time of the job (with <code>-n/--node</code>, this is <code>.</code>). For running jobs, this is appended with <code>(sched)</code> and surrounded with single quotes.</li>
<li><code>runtime</code> : Runtime of the job (with <code>-n/--node</code>, this is <code>.</code>)</li>
<li><code>flags</code> : Flags indicating various types of resource underutilizations</li>
<li><code>booked</code> : Number of booked cores (with <code>-n/--node</code>, this is <code>.</code>)</li>
<li><code>maxmem</code> : Maximum memory used as reported by SLURM (if unavailable, this is <code>.</code>)</li>
<li><code>cores</code> : Number of cores represented in the discovered jobstats files.</li>
<li><code>node</code> : Node(s) booked for the job, expanded into individual node names, separated by commas; if no nodes were found, this is <code>.</code>. The nodes for which jobstats files are available are listed first.</li>
<li><code>jobstats</code> : jobstats files for the nodes, in the same order the nodes are listed, separated by commas; if no jobstats files were discovered, this is <code>.</code></li>
</ul>
<p>If <code>-r</code>/<code>--running</code> was used, an additional field is present:</p>
<ul>
<li><code>timelimit_minutes</code> : The time limit of the job in minutes</li>
</ul>
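<p>Because the table is tab-separated it is easy to post-process. A small sketch, using a placeholder project name: save the table with a header line (<code>-d</code>), then print the job ID and flags columns (fields 1 and 8):</p>
<pre><code># Save the table output, then list job ID and flags for each line (including the header).
jobstats -d -A myproj > myproj-jobstats.txt
awk -F'\t' '{ print $1 "\t" $8 }' myproj-jobstats.txt</code></pre>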
<p>At completion of the script, a brief summary is produced:</p>
<pre><code>*** No jobstats files found for 25 out of 56 jobs, limited resource usage diagnosis and no plot produced</code></pre>
<h1 id="flags">Flags</h1>
<p>Usage flags are an important part of <code>jobstats</code> output. They indicate that booked resources, whether processor cores, memory, or both, might have been underused.</p>
<p>In both plot and table output, flags are a comma-separated list of cautions regarding core and/or memory underutilisation. The appearance of a flag does not necessarily mean that resources were used incorrectly; whether it does depends upon the tools being used, the contents of the SLURM header, and the job profile. Because usage information is gathered every 5 minutes, transient spikes in core or memory usage may not be captured in the log files.</p>
<p>Flags most likely to represent real overbooking of resources are <code>nodes_overbooked</code>, <code>overbooked</code>, <code>!!half_overbooked</code>, <code>!!severely_overbooked</code>, and <code>!!swap_used</code>.</p>
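<p>One quick way to spot such jobs in saved table output (for example the <code>myproj-completed-jobs.txt</code> file written in the Mode 5 example above) is to grep for those flag names:</p>
<pre><code># Table lines whose flags include the more serious overbooking cautions.
# The plain 'overbooked' flag is omitted here because the pattern would also
# match cores_overbooked and mem_overbooked.
grep -E 'nodes_overbooked|!!half_overbooked|!!severely_overbooked|!!swap_used' myproj-completed-jobs.txt</code></pre>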
<p>For jobs that require a larger node than the default (for example, the <code>-C mem256GB</code> flag was used when booking the job with SLURM, and the job actually used more than 128 GB), all of the flags pertaining to partial-node booking are disabled. It is not useful to issue such cautions because it is not possible to book just a portion of such nodes at UPPMAX.</p>
<p>For multinode jobs, flags other than <code>nodes_overbooked</code> are determined based only on the usage of the first node. Multinode jobs require careful analysis to avoid wasting resources, and it is a common mistake among beginning UPPMAX users to book multiple nodes and then run tools that cannot use more than one node. In this case, <code>nodes_overbooked</code> will appear.</p>
<p>Some flags are only produced when usage falls below a threshold. Most flags have the general format <code>flag : value booked : value used</code>.</p>
<ul>
<li><code>nodes_overbooked : nodes booked : nodes used</code> : More nodes were booked than used</li>
<li><code>overbooked : % used</code> : The maximum percentage of booked cores and/or memory that was used (if < 80%)</li>
<li><code>!!half_overbooked</code> : No more than one-half of both cores and memory of a node was used; consider booking half a node instead.</li>
<li><code>!!severely_overbooked</code> : No more than one-quarter of both cores and memory of a node was used; examine your job requirements closely.</li>
<li><code>!!swap_used</code> : Swap storage was used at any point within the job run</li>
<li><code>node_type_overbooked : type booked : type used</code> : A fat node was requested that was larger than was needed. This flag may be produced spuriously if SLURM ran the job on a fat node when a fat node was not requested by the user.</li>
<li><code>cores_overbooked : cores booked : cores used</code> : More cores were booked than used (if < 80%)</li>
<li><code>mem_overbooked : GB booked : GB used</code> : More memory was available than was used (if < 25% and more than one core).</li>
<li><code>core_mem_overbooked : GB in used cores : GB used</code> : Less memory was used than was available in the cores that were used (if < 50%).</li>
</ul>
<p>By default no flags are indicated for jobs with memory-only cautions except for swap usage, because it is common for jobs to heavily use processor cores without using a sizable fraction of memory. Use the <code>-m</code>/<code>--memory</code> option to include flags for memory underutilisation when those would be the only flags produced.</p>
<p>More verbose flags are output with the <code>-v</code>/<code>--verbose</code> option.</p>