Early Profiles

Ethan Gutmann edited this page Jul 25, 2017 · 5 revisions

For now, scaling results are being tested on Cheyenne using gfortran and OpenCoarrays. Cheyenne is a supercomputer at NCAR with 36 cores per node.

First tests

Initial scaling tests on a small domain showed little improvement beyond ~72 cores (2 nodes).

After increasing the domain size substantially, scaling continued up to 180 cores pretty easily.

Figure: large domain scaling

With the domain size increased further to &grid nx=128, ny=14400, nz=32 /, scaling continued up to 1000 cores, with poor scaling to 1800 cores and performance getting worse up to 3600 cores.

Figure: larger domain scaling

Note that the total runtime is only ~14 seconds on many cores, so we undoubtedly need a larger problem to test on.

Figure: larger domain scaling
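To get a feel for how small the per-process work becomes on this domain, here is a minimal Python sketch of an even 1D split. It assumes the decomposition is along ny and that rows are divided as evenly as possible; the helper name `rows_per_image` is hypothetical and is not taken from the model code.

```python
def rows_per_image(n_rows, n_images):
    """Split n_rows as evenly as possible across n_images (1D strips).

    Assumption: a plain even split, with any remainder rows handed
    to the first images. The real model may divide the domain differently.
    """
    base, extra = divmod(n_rows, n_images)
    # The first `extra` images each take one additional row.
    return [base + 1 if i < extra else base for i in range(n_images)]


if __name__ == "__main__":
    ny = 14400  # from the &grid namelist above
    for n_images in (72, 180, 1000, 1800, 3600):
        rows = rows_per_image(ny, n_images)
        print(n_images, "images:", min(rows), "to", max(rows), "rows each")
```

Under these assumptions each image holds 200 rows at 72 images but only 4 rows at 3600 images, so there is very little computation left per image relative to fixed costs.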

Apparently the combination of OpenCoarrays, gfortran, and Cheyenne leads to bad scaling of coarray allocations. An issue has been filed with OpenCoarrays.

2D domain decomposition

The coarray domain decomposition has now been modified to divide the domain along two dimensions instead of just one. This leads to better scaling at higher core counts and on normal-sized domains (1024 x 1024 instead of 128 x 14400). Surprisingly, though, the 1D decomposition still does very well and even beats the 2D decomposition up to ~500 processes. Presumably the 2D decomposition has to talk to nodes that are further away in the network topology. Perhaps changing to 2 codimensions would let the runtime optimize that configuration somewhat.

Figure: 1D vs 2D domain decomposition scaling

Figure: 1D vs 2D domain decomposition runtimes
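
The trade-off between the two decompositions can be sketched with a rough count of halo cells per image. This is only an illustration under stated assumptions: a one-cell-wide halo, a single vertical level, no corner exchanges, and a near-square process grid for the 2D case. The function names `near_square_factors` and `halo_cells` are hypothetical, and the model's actual tiling logic may differ.

```python
import math


def near_square_factors(n):
    """Return (px, py) with px * py == n, px <= py, as close to square as possible."""
    px = math.isqrt(n)
    while n % px != 0:
        px -= 1
    return px, n // px


def halo_cells(nx, ny, px, py, width=1):
    """Approximate halo cells sent per interior image for a px-by-py tiling.

    Assumptions: one horizontal level, no corner exchanges, and every
    image is treated as an interior image with neighbors on all
    decomposed sides.
    """
    tile_x = math.ceil(nx / px)
    tile_y = math.ceil(ny / py)
    cells = 0
    if px > 1:
        cells += 2 * width * tile_y  # two neighbors in x, each edge is tile_y long
    if py > 1:
        cells += 2 * width * tile_x  # two neighbors in y, each edge is tile_x long
    return cells


if __name__ == "__main__":
    nx = ny = 1024
    n_images = 512
    px, py = near_square_factors(n_images)
    print("1D strips:", halo_cells(nx, ny, 1, n_images), "halo cells, 2 neighbors")
    print("2D tiles :", halo_cells(nx, ny, px, py), "halo cells, 4 neighbors")
```

For a 1024 x 1024 domain on 512 images, this estimate gives 2048 halo cells per image for 1D strips versus 192 for a 16 x 32 tiling, but the 2D case exchanges with four neighbors instead of two. The count says nothing about where those neighbors sit in the network, which is the effect suspected above.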
