-
Notifications
You must be signed in to change notification settings - Fork 14
/
Copy pathCHANGES
82 lines (60 loc) · 3.46 KB
/
CHANGES
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
======================================================================
Changes in 2.1a1
======================================================================
- Allow user to specify the GPU awareness in underlying MPI implementation
via environment variable OSHMPI_MPI_GPU_FEATURES ("pt2pt,put,get,acc",
default all).
- Support active message based RMA (i.e., using MPI PT2PT). Disable by default.
- Support memory space with CUDA memory kind. New APIs include:
void shmemx_space_create(shmemx_space_config_t space_config, shmemx_space_t * space)
void shmemx_space_destroy(shmemx_space_t space)
int shmemx_space_create_ctx(shmemx_space_t space, long options, shmem_ctx_t * ctx)
void shmemx_space_attach(shmemx_space_t space)
void shmemx_space_detach(shmemx_space_t space)
void *shmemx_space_malloc(shmemx_space_t space, size_t size)
void *shmemx_space_calloc(shmemx_space_t space, size_t count, size_t size)
void *shmemx_space_align(shmemx_space_t space, size_t alignment, size_t size)
Examples to use these APIs can be found at the built-in test suite:
tests/space.c
tests/space_int_amo.c
tests/space_int_put.c
tests/space_ctx_int_put.c
- Enable MPI dynamic window based optimization. See configure option
--enable-win-type=dynamic_win. Recommended to use with MPICH/OFI version
3.4b1 or newer.
- Allow --enable-fast=ipo to force inline performance critical routines.
======================================================================
Changes in 2.0rc
======================================================================
- Do not inline performance noncritical paths such as init, finalize,
and memory management functions. It allows faster build and less
instruction cache misses at runtime.
- Enable operation tracking to avoid unnecessary network flush. See
configure option --enable-op-tracking.
- Enable datatype cache to reduce MPI derived datatype creation overhead at
strided operations. See configure option --enable-strided-cache.
- Disable internal assertion when --enable-fast configure option is set.
- Update openpa git repository URL.
- Bug fixes:
+ Unsupported MPI datatype with complexd reduce calls.
+ Compile openpa only when multithread safety is set.
======================================================================
Changes in 2.0b1
======================================================================
- Full support of OpenSHMEM 1.4 specification.
- Function inline for all OSHMPI internal routines.
- Caching internal MPI communicators for collective operations with PE
active set.
- OpenSHMEM multithreading safety support. Internal POSIX mutex-based
per-object critical sections are enabled to ensure thread-safety when
using OSHMPI with multithreaded OpenSHMEM program (SHMEM_THREAD_MULTIPLE).
See --enable-threads configure option.
- Active message support of OpenSHMEM atomic operations. MPI accumulates
cannot be directly used in OpenSHMEM atomics because MPI does not
guarantee atomicity between different reduce operations (e.g., add and cswap).
Therefore, the active message based method is used by default. An MPI
accumulates based version can be enabled by setting the configure option
--enable-amo=direct, or setting the OSHMPI_AMO_OPS environment variable
at runtime (See README).
- Asynchronous progress thread for active message based atomic operations.
Disabled by default. See --enable-async-thread configure option.