gh-84559: Change the multiprocessing start method default to `forkserver` (GH-101556)

Change the default multiprocessing start method from fork to forkserver or spawn on the remaining platforms where it was still fork. See the issue for context. This makes the default far more thread-safe, except for code that spawns threads at import time (don't do that!).

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <[email protected]>
3 people authored Sep 26, 2024
1 parent 83e5dc0 commit b65f2cd
Showing 7 changed files with 75 additions and 33 deletions.
14 changes: 6 additions & 8 deletions Doc/library/concurrent.futures.rst
@@ -286,14 +286,6 @@ to a :class:`ProcessPoolExecutor` will result in deadlock.

Added the *initializer* and *initargs* arguments.

.. note::
The default :mod:`multiprocessing` start method
(see :ref:`multiprocessing-start-methods`) will change away from
*fork* in Python 3.14. Code that requires *fork* be used for their
:class:`ProcessPoolExecutor` should explicitly specify that by
passing a ``mp_context=multiprocessing.get_context("fork")``
parameter.

.. versionchanged:: 3.11
The *max_tasks_per_child* argument was added to allow users to
control the lifetime of workers in the pool.
@@ -310,6 +302,12 @@ to a :class:`ProcessPoolExecutor` will result in deadlock.
*max_workers* uses :func:`os.process_cpu_count` by default, instead of
:func:`os.cpu_count`.

.. versionchanged:: 3.14
The default process start method (see
:ref:`multiprocessing-start-methods`) changed away from *fork*. If you
require the *fork* start method for :class:`ProcessPoolExecutor` you must
explicitly pass ``mp_context=multiprocessing.get_context("fork")``.

.. _processpoolexecutor-example:

ProcessPoolExecutor Example
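The `versionchanged` note in this file says that code depending on *fork* must now pass `mp_context` explicitly. A minimal sketch of that opt-in follows. The helper names `pick_context` and `make_executor` are hypothetical and not part of the commit; the fallback to the platform default is an assumption made so the sketch also runs where *fork* does not exist (for example on Windows).

```python
import concurrent.futures
import multiprocessing


def pick_context(method="fork"):
    """Return a context for `method`, or for the platform default if unavailable.

    Hypothetical helper: get_all_start_methods() lists the default first, so
    its first entry is used as the fallback.
    """
    available = multiprocessing.get_all_start_methods()
    if method not in available:
        method = available[0]
    return multiprocessing.get_context(method)


def make_executor(method="fork", max_workers=2):
    """Build a ProcessPoolExecutor pinned to an explicit start method."""
    return concurrent.futures.ProcessPoolExecutor(
        max_workers=max_workers,
        mp_context=pick_context(method),
    )


if __name__ == "__main__":
    with make_executor("fork") as pool:
        # Builtins such as abs() are pickled by reference, so this call
        # does not depend on which start method ends up being used.
        print(list(pool.map(abs, [-1, -2, 3])))
```

Pinning the context at the call site keeps the choice local to one executor, so the rest of the program still gets the new, safer default.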
21 changes: 15 additions & 6 deletions Doc/library/multiprocessing.rst
@@ -124,11 +124,11 @@ to start a process. These *start methods* are
inherited by the child process. Note that safely forking a
multithreaded process is problematic.

Available on POSIX systems. Currently the default on POSIX except macOS.
Available on POSIX systems.

.. note::
The default start method will change away from *fork* in Python 3.14.
Code that requires *fork* should explicitly specify that via
.. versionchanged:: 3.14
This is no longer the default start method on any platform.
Code that requires *fork* must explicitly specify that via
:func:`get_context` or :func:`set_start_method`.

.. versionchanged:: 3.12
@@ -146,9 +146,11 @@ to start a process. These *start methods* are
side-effect so it is generally safe for it to use :func:`os.fork`.
No unnecessary resources are inherited.

Available on POSIX platforms which support passing file descriptors
over Unix pipes such as Linux.
Available on POSIX platforms which support passing file descriptors over
Unix pipes such as Linux. The default on those.

.. versionchanged:: 3.14
This became the default start method on POSIX platforms.

.. versionchanged:: 3.4
*spawn* added on all POSIX platforms, and *forkserver* added for
@@ -162,6 +164,13 @@ to start a process. These *start methods* are
method should be considered unsafe as it can lead to crashes of the
subprocess as macOS system libraries may start threads. See :issue:`33725`.

.. versionchanged:: 3.14

On POSIX platforms the default start method was changed from *fork* to
*forkserver* to retain the performance but avoid common multithreaded
process incompatibilities. See :gh:`84559`.


On POSIX using the *spawn* or *forkserver* start methods will also
start a *resource tracker* process which tracks the unlinked named
system resources (such as named semaphores or
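The two opt-in routes named in the *fork* note, `get_context` and `set_start_method`, differ in scope: a context applies only to the code that holds it, while `set_start_method` fixes the process-wide default and is meant to be called at most once. The sketch below illustrates the scoped route. `start_method_report` and `run_with` are hypothetical names used for illustration only.

```python
import multiprocessing


def start_method_report():
    """Summarize the platform's start methods (the default is listed first)."""
    methods = multiprocessing.get_all_start_methods()
    return {"default": methods[0], "available": methods}


def run_with(method, target, args=()):
    """Run target(*args) in a child started with an explicit start method."""
    ctx = multiprocessing.get_context(method)  # ValueError if unsupported
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()
    return proc.exitcode


if __name__ == "__main__":
    print(start_method_report())
    # Scoped opt-in: only this child uses the chosen start method.
    print(run_with("spawn", print, ("hello from a spawned child",)))
    # Global opt-in (POSIX only); uncomment to restore the pre-3.14 default:
    # multiprocessing.set_start_method("fork")
```

The `if __name__ == "__main__":` guard matters more after this change, because both *spawn* and *forkserver* import the main module in the child instead of inheriting the parent's memory.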
8 changes: 8 additions & 0 deletions Doc/whatsnew/3.14.rst
@@ -385,6 +385,14 @@ Deprecated
as a single positional argument.
(Contributed by Serhiy Storchaka in :gh:`109218`.)

* :mod:`multiprocessing` and :mod:`concurrent.futures`:
The default start method (see :ref:`multiprocessing-start-methods`) changed
away from *fork* to *forkserver* on platforms where it was not already
*spawn* (Windows & macOS). If you require the threading incompatible *fork*
start method you must explicitly specify it when using :mod:`multiprocessing`
or :mod:`concurrent.futures` APIs.
(Contributed by Gregory P. Smith in :gh:`84559`.)

* :mod:`os`:
:term:`Soft deprecate <soft deprecated>` :func:`os.popen` and
:func:`os.spawn* <os.spawnl>` functions. They should no longer be used to
26 changes: 13 additions & 13 deletions Lib/multiprocessing/context.py
@@ -259,13 +259,12 @@ def get_start_method(self, allow_none=False):

def get_all_start_methods(self):
"""Returns a list of the supported start methods, default first."""
if sys.platform == 'win32':
return ['spawn']
else:
methods = ['spawn', 'fork'] if sys.platform == 'darwin' else ['fork', 'spawn']
if reduction.HAVE_SEND_HANDLE:
methods.append('forkserver')
return methods
default = self._default_context.get_start_method()
start_method_names = [default]
start_method_names.extend(
name for name in _concrete_contexts if name != default
)
return start_method_names


#
@@ -320,14 +319,15 @@ def _check_available(self):
'spawn': SpawnContext(),
'forkserver': ForkServerContext(),
}
if sys.platform == 'darwin':
# bpo-33725: running arbitrary code after fork() is no longer reliable
# on macOS since macOS 10.14 (Mojave). Use spawn by default instead.
_default_context = DefaultContext(_concrete_contexts['spawn'])
# bpo-33725: running arbitrary code after fork() is no longer reliable
# on macOS since macOS 10.14 (Mojave). Use spawn by default instead.
# gh-84559: We changed everyones default to a thread safeish one in 3.14.
if reduction.HAVE_SEND_HANDLE and sys.platform != 'darwin':
_default_context = DefaultContext(_concrete_contexts['forkserver'])
else:
_default_context = DefaultContext(_concrete_contexts['fork'])
_default_context = DefaultContext(_concrete_contexts['spawn'])

else:
else: # Windows

class SpawnProcess(process.BaseProcess):
_start_method = 'spawn'
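Taken together, the two hunks in this file amount to a small decision rule: choose *forkserver* where handle passing works and the platform is not macOS, otherwise choose *spawn*, and then list the default first. The standalone sketch below restates that rule as pure functions so it can be checked without importing `multiprocessing`. `choose_default` and `ordered_start_methods` are hypothetical names, and the `have_send_handle` parameter stands in for `reduction.HAVE_SEND_HANDLE`.

```python
def choose_default(platform, have_send_handle):
    """Mirror the 3.14 default selection as a pure function (illustrative)."""
    if platform == "win32":
        # Windows only ever had spawn.
        return "spawn"
    if have_send_handle and platform != "darwin":
        return "forkserver"
    # macOS, or POSIX platforms that cannot pass file descriptors.
    return "spawn"


def ordered_start_methods(default, concrete_names):
    """Return the default first, followed by the other names in their order."""
    return [default] + [name for name in concrete_names if name != default]


if __name__ == "__main__":
    posix_names = ["fork", "spawn", "forkserver"]
    for platform, handles in [("linux", True), ("linux", False), ("darwin", True)]:
        default = choose_default(platform, handles)
        print(platform, handles, ordered_start_methods(default, posix_names))
```

For a Linux-like platform with handle passing this gives `['forkserver', 'fork', 'spawn']`. In both of the other cases *spawn* moves to the front.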
24 changes: 19 additions & 5 deletions Lib/test/_test_multiprocessing.py
@@ -5553,15 +5553,29 @@ def test_set_get(self):
multiprocessing.set_start_method(old_method, force=True)
self.assertGreaterEqual(count, 1)

def test_get_all(self):
def test_get_all_start_methods(self):
methods = multiprocessing.get_all_start_methods()
self.assertIn('spawn', methods)
if sys.platform == 'win32':
self.assertEqual(methods, ['spawn'])
elif sys.platform == 'darwin':
self.assertEqual(methods[0], 'spawn') # The default is first.
# Whether these work or not, they remain available on macOS.
self.assertIn('fork', methods)
self.assertIn('forkserver', methods)
else:
self.assertTrue(methods == ['fork', 'spawn'] or
methods == ['spawn', 'fork'] or
methods == ['fork', 'spawn', 'forkserver'] or
methods == ['spawn', 'fork', 'forkserver'])
# POSIX
self.assertIn('fork', methods)
if other_methods := set(methods) - {'fork', 'spawn'}:
# If there are more than those two, forkserver must be one.
self.assertEqual({'forkserver'}, other_methods)
# The default is the first method in the list.
self.assertIn(methods[0], {'forkserver', 'spawn'},
msg='3.14+ default must not be fork')
if methods[0] == 'spawn':
# Confirm that the current default selection logic prefers
# forkserver vs spawn when available.
self.assertNotIn('forkserver', methods)

def test_preload_resources(self):
if multiprocessing.get_start_method() != 'forkserver':
10 changes: 9 additions & 1 deletion Lib/test/support/__init__.py
@@ -2209,7 +2209,15 @@ def skip_if_broken_multiprocessing_synchronize():
# bpo-38377: On Linux, creating a semaphore fails with OSError
# if the current user does not have the permission to create
# a file in /dev/shm/ directory.
synchronize.Lock(ctx=None)
import multiprocessing
synchronize.Lock(ctx=multiprocessing.get_context('fork'))
# The explicit fork mp context is required in order for
# TestResourceTracker.test_resource_tracker_reused to work.
# synchronize creates a new multiprocessing.resource_tracker
# process at module import time via the above call in that
# scenario. Awkward. This enables gh-84559. No code involved
# should have threads at that point so fork() should be safe.

except OSError as exc:
raise unittest.SkipTest(f"broken multiprocessing SemLock: {exc!r}")

@@ -0,0 +1,5 @@
The default :mod:`multiprocessing` start method on Linux and other POSIX
systems has been changed away from often unsafe ``"fork"`` to ``"forkserver"``
(when the platform supports sending file handles over pipes as most do) or
``"spawn"``. Mac and Windows are unchanged as they already default to
``"spawn"``.
