feat: refactor core/thread logic for mpibackend
This takes George's old GUI-specific `_available_cores()` method, moves
it, and greatly expands it to include updates to the logic about cores
and hardware-threading which was previously inside
`MPIBackend.__init__()`. This was necessary due to the number of common
but different outcomes based on platform, architecture,
hardware-threading support, and user choice. These changes do not
involve very many lines of code, but a good amount of thought and
testing has gone into them. Importantly, these `MPIBackend` API changes
are backwards-compatible, and no changes to current usage code are
needed. I suggest you read the long comments in
`parallel_backends.py::_determine_cores_hwthreading()` outlining how
each variation is handled.

Previously, if the user did not provide the number of MPI Processes they
wanted to use, `MPIBackend` assumed that the number of detected
"logical" cores would suffice. As George previously showed, this does
not work for HPC environments like on OSCAR, where the only true number
of cores that we are allowed to use is found by
`psutil.Process().cpu_affinity()`, the "affinity" core number. Besides
"logical" and "affinity", a third kind of core count is also important:
"physical". However, there was an additional problem here
that was still unaddressed: hardware-threading. Different platforms and
situations report different numbers of logical, affinity, and physical
CPU cores. One of the factors that affects this is if there is
hardware-threading present on the machine, such as Intel
Hyperthreading. For example, on a Linux laptop with an Intel chip that
supports Hyperthreading, the logical and physical core numbers differ:
logical includes Hyperthreads (e.g. `psutil.cpu_count(logical=True)`
reports 8 cores), but physical does not (e.g.
`psutil.cpu_count(logical=False)` reports 4 cores). If we tell MPI to use
8 cores ("logical"), then we ALSO need to tell it to enable the
hardware-threading option. However, if the user does not want to enable
hardware-threading, then we need to make this an option, tell MPI to use
4 cores ("physical"), and tell MPI not to use the hardware-threading
option. The
"affinity" core number makes things even more complicated, since in the
Linux laptop example, it is equal to the logical core number. However,
on OSCAR, it is very different from the logical core number, and on
macOS, it is not available at all.
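
As a rough illustration of the three core counts discussed above, the
following sketch prints whatever the current machine reports. The helper
name `report_core_counts()` is hypothetical and is not part of hnn-core;
`psutil` is treated as optional, with a standard-library fallback that
cannot determine the physical count.

```python
import os

try:
    import psutil
except ImportError:  # psutil is treated as optional in this sketch
    psutil = None


def report_core_counts():
    """Return logical, physical, and affinity core counts (None if unknown).

    Hypothetical helper for illustration only; not part of hnn-core.
    """
    counts = {"logical": os.cpu_count(), "physical": None, "affinity": None}
    if psutil is not None:
        counts["logical"] = psutil.cpu_count(logical=True)
        counts["physical"] = psutil.cpu_count(logical=False)
        proc = psutil.Process()
        # cpu_affinity() is not provided on macOS, so guard the call.
        if hasattr(proc, "cpu_affinity"):
            counts["affinity"] = len(proc.cpu_affinity())
    elif hasattr(os, "sched_getaffinity"):
        # Standard-library fallback, available on Linux only.
        counts["affinity"] = len(os.sched_getaffinity(0))
    return counts


print(report_core_counts())
```

On a machine with hardware-threading, "logical" would be expected to
exceed "physical"; on an HPC allocation, "affinity" may be much smaller
than either.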

The lengthy comments in `_determine_cores_hwthreading()` walk through
each common scenario and what I believe is the right resolution for each,
with respect to the number of cores to use and whether or not to use
hardware-threading. These scenarios include: the user choosing to use
hardware-threading (default) or not, across macOS variations with and
without hardware-threading, Linux local computer variations with and
without hardware-threading, and Linux HPC (e.g. OSCAR) variations which
appear to never support hardware-threading. In the Windows case, due to
both jonescompneurolab#589 and the currently-untested MPI integration on
Windows, I always report the machine as not having hardware-threading.
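
To make the scenarios above concrete, here is a deliberately simplified
sketch of this kind of decision. It is NOT the actual body of
`_determine_cores_hwthreading()`: the function name `choose_cores()`, its
parameters, and the exact rules are assumptions made for illustration,
and it ignores the Windows special case and the `sensible_default_cores`
option.

```python
def choose_cores(logical, physical, affinity=None, enable_hwthreading=True):
    """Pick (n_cores, use_hwthreading) from reported core counts.

    Hypothetical simplification for illustration; not hnn-core's logic.
    `affinity` is None where the platform cannot report it (e.g. macOS).
    """
    # HPC-like case: the scheduler allows fewer cores than the node reports,
    # so the affinity count is the only number we may use.
    if affinity is not None and affinity < logical:
        return affinity, False

    # Hardware-threading is assumed present when logical exceeds physical.
    has_hwthreading = physical is not None and logical > physical

    if has_hwthreading and enable_hwthreading:
        return logical, True   # use every hardware thread
    if has_hwthreading:
        return physical, False  # user opted out: physical cores only
    return logical, False       # no hardware-threading on this machine


# Linux laptop with Hyperthreading: 8 logical, 4 physical, affinity 8.
print(choose_cores(8, 4, affinity=8, enable_hwthreading=True))
print(choose_cores(8, 4, affinity=8, enable_hwthreading=False))
```

The two calls correspond to the laptop example from the commit message,
with hardware-threading requested and declined respectively.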

Additionally, previously, if the user did provide a number for MPI
Processes, `MPIBackend` used some "heuristics" to decide whether to use
MPI oversubscription and/or hardware-threading, but the user could not
override these heuristics. Now, when a user instantiates an `MPIBackend`
with `__init__()` and uses the defaults, hardware-threading is detected
more robustly and enabled by default, and oversubscription is enabled
based on its own heuristics; this is the case when the new arguments
`hwthreading` and `oversubscribe` are set to their default value of
`None`. However, if the user knows what they're doing, they can also
pass either `True` or `False` to either of these options to force them
on or off. Furthermore, in the case of `hwthreading`, if the user
indicates they do not want to use it, then
`_determine_cores_hwthreading()` correctly returns the number of
NON-hardware-threaded cores for MPI's use, instead of the core number
including hardware-threads.
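
The `None` / `True` / `False` convention described above can be sketched
as a small tri-state resolver. The helper `resolve_option()` is
hypothetical and only illustrates the pattern; the detected values passed
in stand for whatever the real detection and heuristics produce.

```python
def resolve_option(user_choice, detected):
    """Resolve a tri-state option: None defers to detection, bool forces it.

    Hypothetical helper for illustration; not part of hnn-core.
    """
    if user_choice is None:
        return detected  # default: trust the automatic detection
    if not isinstance(user_choice, bool):
        raise TypeError("option must be None, True, or False")
    return user_choice  # explicit user override wins


# Defaults (None) follow detection; explicit values override it.
print(resolve_option(None, detected=True))
print(resolve_option(False, detected=True))
```

In hnn-core itself, the forced form corresponds to a call such as
`MPIBackend(n_procs=..., mpi_cmd=..., hwthreading=False, oversubscribe=False)`,
which is how the GUI change in this commit invokes it.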

I have also modified and expanded the relevant tests to cover these
changes.

Note that this does NOT change the default number of jobs to use for the
GUI if MPI is detected. Such a change breaks the current `test_gui.py`
testing: see jonescompneurolab#960
asoplata committed Dec 13, 2024

1 parent a67b026 commit 25c78c0
Showing 3 changed files with 368 additions and 91 deletions.
35 changes: 13 additions & 22 deletions hnn_core/gui/gui.py
@@ -8,8 +8,6 @@
 import logging
 import mimetypes
 import numpy as np
-import platform
-import psutil
 import sys
 import json
 import urllib.parse
@@ -36,7 +34,9 @@
                                         get_L5Pyr_params_default)
 from hnn_core.hnn_io import dict_to_network, write_network_configuration
 from hnn_core.cells_default import _exp_g_at_dist
-from hnn_core.parallel_backends import _has_mpi4py, _has_psutil
+from hnn_core.parallel_backends import (_determine_cores_hwthreading,
+                                        _has_mpi4py,
+                                        _has_psutil)

 hnn_core_root = Path(hnn_core.__file__).parent
 default_network_configuration = (hnn_core_root / 'param' /
@@ -347,7 +347,10 @@ def __init__(self, theme_color="#802989",
         self.params = self.load_parameters(network_configuration)

         # Number of available cores
-        self.n_cores = self._available_cores()
+        [self.n_cores, _] = _determine_cores_hwthreading(
+            enable_hwthreading=False,
+            sensible_default_cores=True,
+        )

         # In-memory storage of all simulation and visualization related data
         self.simulation_data = defaultdict(lambda: dict(net=None, dpls=list()))
@@ -407,7 +410,8 @@ def __init__(self, theme_color="#802989",
         self.widget_mpi_cmd = Text(value='mpiexec',
                                    placeholder='Fill if applies',
                                    description='MPI cmd:', disabled=False)
-        self.widget_n_jobs = BoundedIntText(value=1, min=1,
+        self.widget_n_jobs = BoundedIntText(value=1,
+                                            min=1,
                                             max=self.n_cores,
                                             description='Cores:',
                                             disabled=False)
@@ -496,22 +500,6 @@ def __init__(self, theme_color="#802989",
         self._init_ui_components()
         self.add_logging_window_logger()

-    @staticmethod
-    def _available_cores():
-        """Return the number of available cores to the process.
-        This is important for systems where the number of available cores is
-        partitioned such as on HPC systems. Linux and Windows can return cpu
-        affinity, which is the number of available cores. MacOS can only return
-        total system cores.
-        """
-        # For macos
-        if platform.system() == 'Darwin':
-            return psutil.cpu_count()
-        # For Linux and Windows
-        else:
-            return len(psutil.Process().cpu_affinity())
-
     @staticmethod
     def _check_backend():
         """Checks for MPI and returns the default backend name"""
@@ -2108,7 +2096,10 @@ def run_button_clicked(widget_simulation_name, log_out, drive_widgets,
         if backend_selection.value == "MPI":
             backend = MPIBackend(
                 n_procs=n_jobs.value,
-                mpi_cmd=mpi_cmd.value)
+                mpi_cmd=mpi_cmd.value,
+                hwthreading=False,
+                oversubscribe=False,
+            )
         else:
             backend = JoblibBackend(n_jobs=n_jobs.value)
             print(f"Using Joblib with {n_jobs.value} core(s).")