md_0_1_Version2019.3.log
:-) GROMACS - gmx mdrun, 2019.3 (-:
GROMACS is written by:
Emile Apol          Rossen Apostolov    Paul Bauer          Herman J.C. Berendsen
Par Bjelkmar        Christian Blau      Viacheslav Bolnykh  Kevin Boyd
Aldert van Buuren   Rudi van Drunen     Anton Feenstra      Alan Gray
Gerrit Groenhof     Anca Hamuraru       Vincent Hindriksen  M. Eric Irrgang
Aleksei Iupinov     Christoph Junghans  Joe Jordan          Dimitrios Karkoulis
Peter Kasson        Jiri Kraus          Carsten Kutzner     Per Larsson
Justin A. Lemkul    Viveca Lindahl      Magnus Lundborg     Erik Marklund
Pascal Merz         Pieter Meulenhoff   Teemu Murtola       Szilard Pall
Sander Pronk        Roland Schulz       Michael Shirts      Alexey Shvetsov
Alfons Sijbers      Peter Tieleman      Jon Vincent         Teemu Virolainen
Christian Wennberg  Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2018, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx mdrun, version 2019.3
Executable: /shared/ucl/apps/gromacs/2019.3/intel-2018/bin/gmx
Data prefix: /shared/ucl/apps/gromacs/2019.3/intel-2018
Working dir: /lustre/scratch/scratch/ucbechz/test/8_version2019_short
Process ID: 227853
Command line:
gmx mdrun -deffnm md_0_1 -cpi -append
GROMACS version: 2019.3
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: disabled
SIMD instructions: AVX_512
FFT library: Intel MKL
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /shared/ucl/apps/intel/2018.Update3/compilers_and_libraries_2018.3.222/linux/bin/intel64/icc Intel 18.0.3.20180410
C compiler flags: -xCORE-AVX512 -qopt-zmm-usage=high -mkl=sequential -std=gnu99 -O3 -DNDEBUG -ip -funroll-all-loops -alias-const -ansi-alias -no-prec-div -fimf-domain-exclusion=14 -qoverride-limits
C++ compiler: /shared/ucl/apps/intel/2018.Update3/compilers_and_libraries_2018.3.222/linux/bin/intel64/icpc Intel 18.0.3.20180410
C++ compiler flags: -xCORE-AVX512 -qopt-zmm-usage=high -mkl=sequential -std=c++11 -O3 -DNDEBUG -ip -funroll-all-loops -alias-const -ansi-alias -no-prec-div -fimf-domain-exclusion=14 -qoverride-limits
Running on 1 node with total 36 cores, 36 logical cores
Hardware detected:
CPU info:
Vendor: Intel
Brand: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Family: 6 Model: 85 Stepping: 4
Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Number of AVX-512 FMA units: 2
Hardware topology: Only logical processor count
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
Lindahl
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX 1 (2015) pp. 19-25
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Páll, M. J. Abraham, C. Kutzner, B. Hess, E. Lindahl
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with
GROMACS
In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale 8759 (2015) pp. 3-27
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R.
Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, and E. Lindahl
GROMACS 4.5: a high-throughput and highly parallel open source molecular
simulation toolkit
Bioinformatics 29 (2013) pp. 845-54
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
++++ PLEASE CITE THE DOI FOR THIS VERSION OF GROMACS ++++
https://doi.org/10.5281/zenodo.3243833
-------- -------- --- Thank You --- -------- --------
Non-default thread affinity set, disabling internal thread affinity
Input Parameters:
integrator = md
tinit = 0
dt = 0.002
nsteps = 1000
init-step = 0
simulation-part = 1
comm-mode = Linear
nstcomm = 100
bd-fric = 0
ld-seed = -2014963481
emtol = 10
emstep = 0.01
niter = 20
fcstep = 0
nstcgsteep = 1000
nbfgscorr = 10
rtpi = 0.05
nstxout = 50000
nstvout = 0
nstfout = 0
nstlog = 50000
nstcalcenergy = 100
nstenergy = 50000
nstxout-compressed = 50000
compressed-x-precision = 1000
cutoff-scheme = Verlet
nstlist = 10
ns-type = Grid
pbc = xyz
periodic-molecules = false
verlet-buffer-tolerance = 0.005
rlist = 1
coulombtype = PME
coulomb-modifier = Potential-shift
rcoulomb-switch = 0
rcoulomb = 1
epsilon-r = 1
epsilon-rf = inf
vdw-type = Cut-off
vdw-modifier = Potential-shift
rvdw-switch = 0
rvdw = 1
DispCorr = EnerPres
table-extension = 1
fourierspacing = 0.16
fourier-nx = 72
fourier-ny = 72
fourier-nz = 72
pme-order = 4
ewald-rtol = 1e-05
ewald-rtol-lj = 0.001
lj-pme-comb-rule = Geometric
ewald-geometry = 0
epsilon-surface = 0
tcoupl = V-rescale
nsttcouple = 10
nh-chain-length = 0
print-nose-hoover-chain-variables = false
pcoupl = Parrinello-Rahman
pcoupltype = Isotropic
nstpcouple = 10
tau-p = 2
compressibility (3x3):
compressibility[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compressibility[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
ref-p (3x3):
ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
refcoord-scaling = No
posres-com (3):
posres-com[0]= 0.00000e+00
posres-com[1]= 0.00000e+00
posres-com[2]= 0.00000e+00
posres-comB (3):
posres-comB[0]= 0.00000e+00
posres-comB[1]= 0.00000e+00
posres-comB[2]= 0.00000e+00
QMMM = false
QMconstraints = 0
QMMMscheme = 0
MMChargeScaleFactor = 1
qm-opts:
ngQM = 0
constraint-algorithm = Lincs
continuation = true
Shake-SOR = false
shake-tol = 0.0001
lincs-order = 4
lincs-iter = 1
lincs-warnangle = 30
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype[0] = -1
wall-atomtype[1] = -1
wall-density[0] = 0
wall-density[1] = 0
wall-ewald-zfac = 3
pull = false
awh = false
rotation = false
interactiveMD = false
disre = No
disre-weighting = Conservative
disre-mixed = false
dr-fc = 1000
dr-tau = 0
nstdisreout = 100
orire-fc = 0
orire-tau = 0
nstorireout = 100
free-energy = no
cos-acceleration = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
simulated-tempering = false
swapcoords = no
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
applied-forces:
  electric-field:
    x:
      E0 = 0
      omega = 0
      t0 = 0
      sigma = 0
    y:
      E0 = 0
      omega = 0
      t0 = 0
      sigma = 0
    z:
      E0 = 0
      omega = 0
      t0 = 0
      sigma = 0
grpopts:
nrdf: 13160.8 222576
ref-t: 300 300
tau-t: 0.1 0.1
annealing: No No
annealing-npoints: 0 0
acc: 0 0 0
nfreeze: N N N
energygrp-flags[ 0]: 0
Changing nstlist from 10 to 50, rlist from 1 to 1.116
Initializing Domain Decomposition on 36 ranks
Dynamic load balancing: locked
Minimum cell size due to atom displacement: 0.479 nm
Initial maximum distances in bonded interactions:
two-body bonded interactions: 0.454 nm, LJ-14, atoms 3576 4713
multi-body bonded interactions: 0.454 nm, Ryckaert-Bell., atoms 3576 4713
Minimum cell size due to bonded interactions: 0.499 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.872 nm
Estimated maximum distance required for P-LINCS: 0.872 nm
This distance will limit the DD cell size, you can override this with -rcon
Guess for relative PME load: 0.24
Will use 27 particle-particle and 9 PME only ranks
This is a guess, check the performance at the end of the log file
Using 9 separate PME ranks, as guessed by mdrun
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 27 cells with a minimum initial size of 1.089 nm
The maximum allowed number of cells is: X 9 Y 9 Z 9
Domain decomposition grid 3 x 3 x 3, separate PME ranks 9
PME domain decomposition: 3 x 3 x 1
Interleaving PP and PME ranks
This rank does only particle-particle work.
Domain decomposition rank 0, coordinates 0 0 0
The initial number of communication pulses is: X 1 Y 1 Z 1
The initial domain decomposition cell size is: X 3.51 nm Y 3.51 nm Z 3.51 nm
The maximum allowed distance for atoms involved in interactions is:
non-bonded interactions 1.116 nm
(the following are initial values, they could change due to box deformation)
two-body bonded interactions (-rdd) 1.116 nm
multi-body bonded interactions (-rdd) 1.116 nm
atoms separated by up to 5 constraints (-rcon) 3.511 nm
When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: X 1 Y 1 Z 1
The minimum size for domain decomposition cells is 1.116 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: X 0.32 Y 0.32 Z 0.32
The maximum allowed distance for atoms involved in interactions is:
non-bonded interactions 1.116 nm
two-body bonded interactions (-rdd) 1.116 nm
multi-body bonded interactions (-rdd) 1.116 nm
atoms separated by up to 5 constraints (-rcon) 1.116 nm
Using 36 MPI threads
Using 1 OpenMP thread per tMPI thread
System total charge: -0.000
Will do PME sum in reciprocal space for electrostatic interactions.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e+00, Ewald -1.000e-05
Initialized non-bonded Ewald correction tables, spacing: 9.33e-04 size: 1073
Long Range LJ corr.: <C6> 3.1893e-04
Generated table with 1058 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1058 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1058 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1058 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1058 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1058 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Using SIMD 4x8 nonbonded short-range kernels
Using a dual 4x8 pair-list setup updated with dynamic pruning:
outer list: updated every 50 steps, buffer 0.116 nm, rlist 1.116 nm
inner list: updated every 12 steps, buffer 0.003 nm, rlist 1.003 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
outer list: updated every 50 steps, buffer 0.245 nm, rlist 1.245 nm
inner list: updated every 12 steps, buffer 0.048 nm, rlist 1.048 nm
Using geometric Lennard-Jones combination rule
Initializing Parallel LINear Constraint Solver
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess
P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 116-122
-------- -------- --- Thank You --- -------- --------
The number of constraints is 6705
There are constraints between atoms in different decomposition domains,
will communicate selected coordinates each lincs iteration
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------
Linking all bonded interactions to atoms
Intra-simulation communication will occur every 10 steps.
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
G. Bussi, D. Donadio and M. Parrinello
Canonical sampling through velocity rescaling
J. Chem. Phys. 126 (2007) pp. 014101
-------- -------- --- Thank You --- -------- --------
There are: 117860 Atoms
Atom distribution over 27 domains: av 4365 stddev 75 min 4289 max 4626
Started mdrun on rank 0 Thu Jan 16 13:44:19 2020
           Step           Time
              0        0.00000

   Energies (kJ/mol)
          Angle    Proper Dih. Ryckaert-Bell.          LJ-14     Coulomb-14
    1.31131e+04    7.43495e+02    6.17820e+03    9.77778e+03    5.15413e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      Potential
    3.28241e+05   -1.58634e+04   -2.24237e+06    1.06330e+04   -1.83800e+06
    Kinetic En.   Total Energy  Conserved En.    Temperature Pres. DC (bar)
    2.93000e+05   -1.54500e+06   -1.54493e+06    2.98976e+02   -2.25711e+02
 Pressure (bar)   Constr. rmsd
    1.60558e+02    2.70869e-05
DD step 49 load imb.: force 9.1% pme mesh/force 5.439
step 200: timed with pme grid 72 72 72, coulomb cutoff 1.000: 68982.8 M-cycles
step 300: timed with pme grid 60 60 60, coulomb cutoff 1.097: 66679.8 M-cycles
step 400: timed with pme grid 52 52 52, coulomb cutoff 1.266: 65526.9 M-cycles
step 500: timed with pme grid 48 48 48, coulomb cutoff 1.371: 66567.2 M-cycles
step 600: timed with pme grid 44 44 44, coulomb cutoff 1.496: 68166.1 M-cycles
step 600: the maximum allowed grid scaling limits the PME load balancing to a coulomb cut-off of 1.496
step 700: timed with pme grid 44 44 44, coulomb cutoff 1.496: 67970.5 M-cycles
step 800: timed with pme grid 48 48 48, coulomb cutoff 1.371: 64550.7 M-cycles
step 900: timed with pme grid 52 52 52, coulomb cutoff 1.266: 76490.0 M-cycles
step 1000: timed with pme grid 56 56 56, coulomb cutoff 1.176: 61391.8 M-cycles
DD step 999 load imb.: force 8.2% pme mesh/force 5.479
           Step           Time
           1000        2.00000

Writing checkpoint, step 1000 at Thu Jan 16 13:58:00 2020

   Energies (kJ/mol)
          Angle    Proper Dih. Ryckaert-Bell.          LJ-14     Coulomb-14
    1.35754e+04    7.46147e+02    6.14961e+03    9.67514e+03    5.14736e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      Potential
    3.27730e+05   -1.58649e+04   -2.24139e+06    7.58552e+03   -1.84032e+06
    Kinetic En.   Total Energy  Conserved En.    Temperature Pres. DC (bar)
    2.94477e+05   -1.54585e+06   -1.54492e+06    3.00483e+02   -2.25755e+02
 Pressure (bar)   Constr. rmsd
   -8.20775e+00    2.79814e-05
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Statistics over 1001 steps using 11 frames
   Energies (kJ/mol)
          Angle    Proper Dih. Ryckaert-Bell.          LJ-14     Coulomb-14
    1.35406e+04    7.63420e+02    6.17614e+03    9.69602e+03    5.15834e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      Potential
    3.27240e+05   -1.58532e+04   -2.23752e+06    5.92859e+03   -1.83844e+06
    Kinetic En.   Total Energy  Conserved En.    Temperature Pres. DC (bar)
    2.93864e+05   -1.54458e+06   -1.54490e+06    2.99857e+02   -2.25421e+02
 Pressure (bar)   Constr. rmsd
    1.49614e+01    0.00000e+00

          Box-X          Box-Y          Box-Z
    1.05349e+01    1.05349e+01    1.05349e+01

   Total Virial (kJ/mol)
    9.82537e+04    9.35636e+01   -6.24085e+02
    9.30723e+01    9.85424e+04   -1.02442e+02
   -6.43008e+02   -1.11490e+02    9.54891e+04

   Pressure (bar)
   -1.02708e+01   -1.06682e+00    1.89408e+01
   -1.05313e+00   -1.32159e+01    7.06726e-01
    1.94782e+01    9.63780e-01    6.83708e+01

      T-Protein  T-non-Protein
    3.00223e+02    2.99835e+02
P P   -   P M E   L O A D   B A L A N C I N G
PP/PME load balancing changed the cut-off and PME settings:
              particle-particle               PME
              rcoulomb  rlist       grid      spacing   1/beta
   initial    1.000 nm  1.003 nm    72 72 72  0.146 nm  0.320 nm
   final      1.097 nm  1.100 nm    60 60 60  0.176 nm  0.351 nm
   cost-ratio           1.32        0.58
(note that these numbers concern only part of the total PP and PME load)
M E G A - F L O P S   A C C O U N T I N G
NB=Group-cutoff nonbonded kernels NxN=N-by-N cluster Verlet kernels
RF=Reaction-Field VdW=Van der Waals QSTab=quadratic-spline table
W3=SPC/TIP3p W4=TIP4p (single or pairs)
V&F=Potential and force V=Potential only F=Force only
 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 Pair Search distance check            1479.179112       13312.612      0.2
 NxN Ewald Elec. + LJ [F]             64952.283168     4286850.689     53.1
 NxN Ewald Elec. + LJ [V&F]             704.180320       75347.294      0.9
 NxN Ewald Elec. [F]                  57410.987840     3502070.258     43.3
 NxN Ewald Elec. [V&F]                  622.294592       52272.746      0.6
 1,4 nonbonded interactions              17.508491        1575.764      0.0
 Calc Weights                           353.933580       12741.609      0.2
 Spread Q Bspline                      7550.583040       15101.166      0.2
 Gather F Bspline                      7550.583040       45303.498      0.6
 3D-FFT                                6409.683730       51277.470      0.6
 Solve PME                                9.308400         595.738      0.0
 Reset In Box                             2.357200           7.072      0.0
 CG-CoM                                   2.475060           7.425      0.0
 Angles                                  12.094082        2031.806      0.0
 Propers                                  1.324323         303.270      0.0
 RB-Dihedrals                            13.646633        3370.718      0.0
 Virial                                  12.026575         216.478      0.0
 Stop-CM                                  1.296460          12.965      0.0
 Calc-Ekin                               23.807720         642.808      0.0
 Lincs                                   10.642253         638.535      0.0
 Lincs-Mat                              227.786484         911.146      0.0
 Constraint-V                           141.053014        1128.424      0.0
 Constraint-Vir                          13.158491         315.804      0.0
 Settle                                  39.922836       12895.076      0.2
-----------------------------------------------------------------------------
 Total                                                 8078930.371    100.0
-----------------------------------------------------------------------------
D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
av. #atoms communicated per step for force: 2 x 183748.9
av. #atoms communicated per step for LINCS: 2 x 12691.3
Dynamic load balancing report:
DLB was off during the run due to low measured imbalance.
Average load imbalance: 9.4%.
The balanceable part of the MD step is 15%, load imbalance is computed from this.
Part of the total run time spent waiting due to load imbalance: 1.4%.
Average PME mesh/force load: 5.727
Part of the total run time spent waiting due to PP/PME imbalance: 45.5 %
NOTE: 45.5 % performance was lost because the PME ranks
had more work to do than the PP ranks.
You might want to increase the number of PME ranks
or increase the cut-off and the grid spacing.
R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
On 27 MPI ranks doing PP, and
on 9 MPI ranks doing PME
 Computing:          Num   Num      Call    Wall time         Giga-Cycles
                     Ranks Threads  Count      (s)         total sum    %
-----------------------------------------------------------------------------
 Domain decomp.        27    1         20      34.884       2160.902   3.2
 DD comm. load         27    1          3       0.074          4.574   0.0
 Send X to PME         27    1       1001      17.388       1077.119   1.6
 Neighbor search       27    1         21       3.882        240.490   0.4
 Comm. coord.          27    1        980      75.232       4660.221   6.9
 Force                 27    1       1001       7.454        461.720   0.7
 Wait + Comm. F        27    1       1001      70.863       4389.594   6.5
 PME mesh *             9    1       1001     595.642      12299.016  18.1
 PME wait for PP *                            226.611       4679.132   6.9
 Wait + Recv. PME F    27    1       1001     444.040      27506.032  40.5
 NB X/F buffer ops.    27    1       2961       0.101          6.242   0.0
 Write traj.           27    1          2       0.458         28.378   0.0
 Update                27    1       1001       0.034          2.106   0.0
 Constraints           27    1       1001     154.829       9590.890  14.1
 Comm. energies        27    1        101      10.210        632.489   0.9
 Rest                                           2.971        184.059   0.3
-----------------------------------------------------------------------------
 Total                                        822.420      67926.422 100.0
-----------------------------------------------------------------------------
(*) Note that with separate PME ranks, the walltime column actually sums to
    twice the total reported, but the cycle count total and % are correct.
-----------------------------------------------------------------------------
 Breakdown of PME mesh computation
-----------------------------------------------------------------------------
 PME redist. X/F        9    1       2002     357.774       7387.441  10.9
 PME spread             9    1       1001      80.813       1668.646   2.5
 PME gather             9    1       1001      49.100       1013.830   1.5
 PME 3D-FFT             9    1       2002       0.778         16.061   0.0
 PME 3D-FFT Comm.       9    1       4004     107.116       2211.765   3.3
 PME solve Elec         9    1       1001       0.038          0.786   0.0
-----------------------------------------------------------------------------
               Core t (s)   Wall t (s)        (%)
       Time:    29602.355      822.420     3599.4
                 (ns/day)    (hour/ns)
Performance:        0.210      114.111
Finished mdrun on rank 0 Thu Jan 16 13:58:02 2020