diff --git a/docs/pages/parallel-async.md b/docs/pages/parallel-async.md
index 776368d8..ebc13b9b 100644
--- a/docs/pages/parallel-async.md
+++ b/docs/pages/parallel-async.md
@@ -5,8 +5,8 @@ layout: default
 
 # Parallel and asynchronous processing
 
-Python has a good ecosystem of libraries for parallelising the processing of tasks,
-as well as asynchronous processing.
+Python has a good ecosystem of libraries for parallelising the processing of
+tasks, as well as asynchronous processing.
 
 Parallelisation in Python is typically _process-based_ with code parallelised
 across multiple Python processes each with their own interpreter or makes use of
@@ -21,13 +21,14 @@ simply due to pre-existing code using a library like [pandas].
 
 ## Process-based (and thread-based) parallelism
 
-| Name | Short description | 🚦 |
-| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-: |
-| [multiprocess] | A fork of [multiprocessing] which uses `dill` instead of `pickle` to allow serializing wider range of object types including nested / anonymous functions. We've found this easier to use than `multiprocessing`. | 🟢 |
-| [dask] | Aims to make scaling existing code in familiar libraries (`numpy`, [pandas], `scikit-learn`, ...) easy. | 🟠 |
-| [multiprocessing] | The standard library module for distributing tasks across multiple processes. | 🟠 |
-| [mpi4py] | Support for MPI based parallelism. | 🟠 |
-| [threading] | The standard library module for multi-threading. Due to the _global interpreter lock_ [currently][PEP703] only one thread can execute Python code at a time. | 🔴 |
+| Name | Short description | 🚦 |
+| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-: |
+| [multiprocess] | A fork of [multiprocessing] which uses `dill` instead of `pickle` to allow serializing a wider range of object types including nested / anonymous functions. We've found this easier to use than `multiprocessing`. | 🟢 |
+| [concurrent.futures] | [See the table below](#asynchronous-processing). | 🟠 |
+| [dask] | Aims to make scaling existing code in familiar libraries (`numpy`, [pandas], `scikit-learn`, ...) easy. | 🟠 |
+| [multiprocessing] | The standard library module for distributing tasks across multiple processes. | 🟠 |
+| [mpi4py] | Support for MPI based parallelism. | 🟠 |
+| [threading] | The standard library module for multi-threading. Due to the _global interpreter lock_ [currently][PEP703] only one thread can execute Python code at a time. | 🔴 |
 
 ## Compiler-based parallelism
 
@@ -37,6 +38,19 @@ simply due to pre-existing code using a library like [pandas].
 | [numba] | [Support for parallelism via `jit(parallel=True)`](https://numba.pydata.org/numba-doc/latest/user/parallel.html). | 🟠 |
 | [jax] | [Support for parallelising NumPy / scientific computing like operations using functional transforms](https://jax.readthedocs.io/en/latest/jax-101/06-parallelism.html). | 🟠 |
 
+## Asynchronous processing
+
+| Name | Short description | 🚦 |
+| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-: |
+| [asyncio] | Python standard library for asynchronous programming with tasks run in a single-threaded event loop. Used for [cooperative multitasking](https://en.wikipedia.org/wiki/Cooperative_multitasking). | 🟠 |
+| [concurrent.futures] | Another Python standard library for asynchronous processing. Provides a common interface for thread and process based concurrency as an alternative to using `multiprocess(ing)` or `threading` directly. | 🟠 |
+
+## See also
+
+- This [Stack Overflow post](https://stackoverflow.com/a/61360215) is a nice
+  summary of what each of [threading], [multiprocessing], [asyncio] and
+  [concurrent.futures] do.
+
 
 
 [multiprocess]: https://multiprocess.readthedocs.io/en/stable/
@@ -49,3 +63,5 @@ simply due to pre-existing code using a library like [pandas].
 [dask]: https://docs.dask.org/
 [numba]: https://numba.pydata.org/
 [jax]: https://jax.readthedocs.io/
+[asyncio]: https://docs.python.org/3/library/asyncio.html
+[concurrent.futures]: https://docs.python.org/3/library/concurrent.futures.html
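
Review note on the `concurrent.futures` row: the "common interface for thread and process based concurrency" claim could be illustrated with a small sketch along these lines. This is not part of the patch. The names `square` and `run_with` are hypothetical, chosen only for illustration, and the worker count of 2 is an arbitrary assumption.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import List, Type


def square(x: int) -> int:
    """Hypothetical stand-in for a real unit of work."""
    return x * x


def run_with(executor_cls: Type[Executor], values: List[int]) -> List[int]:
    """Map `square` over `values` using whichever executor class is passed in."""
    # Both pool classes accept `max_workers` and work as context managers.
    with executor_cls(max_workers=2) as executor:
        # Executor.map yields results in input order, like the built-in map.
        return list(executor.map(square, values))


if __name__ == "__main__":
    # The guard matters for process pools: worker processes may re-import
    # this module, and unguarded top-level code would then run again.
    print(run_with(ThreadPoolExecutor, [1, 2, 3]))
    print(run_with(ProcessPoolExecutor, [1, 2, 3]))
```

Only the executor class changes between the two calls, which is the point the table row makes. Whether threads or processes are the better fit still depends on the workload, as the `threading` row's note on the global interpreter lock suggests.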