Releases · omlins/ParallelStencil.jl
ParallelStencil.jl 0.11.1
ParallelStencil.jl 0.11.0
Release notes
- Change the scope of the ParallelStencil initialization parameters to module scope: `@init_parallel_stencil` must now be called once per module; calling it right after `using ParallelStencil` is recommended (CHECK COMPATIBILITY!) (#130) (see the sketch after this list)
- Enable dimension-agnostic kernel generation thanks to the new `ndims` tuple expansion and the type parameter substitution keyword `N` (#134, #136)
- Enable architecture-agnostic creation of indices arrays with the allocator macros (#137)
- Fix type instability when using type aliases directly (#135)
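A minimal sketch of the new module-scope initialization, assuming the standard diffusion-style kernel from the package documentation (module name, array sizes and kernel are illustrative): `@init_parallel_stencil` is called once per module, right after `using ParallelStencil`.

```julia
module Diffusion2D

using ParallelStencil
using ParallelStencil.FiniteDifferences2D
@init_parallel_stencil(Threads, Float64, 2)  # backend, number type, number of dimensions

# Standard 2D diffusion step, defined in the same module that was initialized above.
@parallel function diffusion_step!(T2, T, lam, dt, dx, dy)
    @inn(T2) = @inn(T) + dt*lam*(@d2_xi(T)/dx^2 + @d2_yi(T)/dy^2)
    return
end

end # module
```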
ParallelStencil.jl 0.10.1
Release notes
- Add `@d2_xa` and `@d2_ya` macros for 2D (#131) (sketched below)
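A hedged sketch of how the new macros might be used, assuming the `a` variants operate over the whole array and therefore shrink the differentiated dimension by 2, analogously to `@d_xa`/`@d_ya` (array sizes and the explicit ranges are illustrative):

```julia
using ParallelStencil
using ParallelStencil.FiniteDifferences2D
@init_parallel_stencil(Threads, Float64, 2)

nx, ny = 64, 64
dx, dy = 0.1, 0.1
A       = @rand(nx, ny)
d2A_dx2 = @zeros(nx-2, ny)  # @d2_xa is assumed to shrink the x dimension by 2
d2A_dy2 = @zeros(nx, ny-2)  # @d2_ya is assumed to shrink the y dimension by 2

@parallel function d2_x!(d2A_dx2, A, dx)
    @all(d2A_dx2) = @d2_xa(A)/dx^2
    return
end

@parallel function d2_y!(d2A_dy2, A, dy)
    @all(d2A_dy2) = @d2_ya(A)/dy^2
    return
end

# Explicit ranges matching the (smaller) output arrays.
@parallel (1:nx-2, 1:ny) d2_x!(d2A_dx2, A, dx)
@parallel (1:nx, 1:ny-2) d2_y!(d2A_dy2, A, dy)
```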
ParallelStencil.jl 0.10.0
Release notes
- Enable automatic ranges detection for any bitstype variables and nested (named) tuples containing any supported type (#121) (see the sketch after this list)
- Add tuple type aliases and NamedTuple constructors in the Data module (#121)
- Allow complex arguments for compute macros (expanding arguments) (#121)
- Add support for AMDGPU 0.8 (#123)
- Add support for CUDA 5 (#129)
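A hedged sketch of the nested named-tuple support, assuming automatic ranges detection picks up the arrays inside the tuple as described in #121 (field names and sizes are illustrative):

```julia
using ParallelStencil
@init_parallel_stencil(Threads, Float64, 2)

nx, ny = 64, 64
fields = (A = @rand(nx, ny), B = @zeros(nx, ny))  # named tuple of fields

@parallel_indices (ix, iy) function scale!(fields, s)
    fields.B[ix, iy] = s * fields.A[ix, iy]
    return
end

# No explicit ranges: they are detected from the arrays nested in the named tuple.
@parallel scale!(fields, 2.0)
```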
ParallelStencil.jl 0.9.0
Release notes
- Add support for automatic addition of `@inbounds` in kernels by keyword (`inbounds`) and make the global default settable in `@init_parallel_stencil` (#119)
- Enable usage of structs, NamedTuples and other expressions in `@parallel` calls (#114)
- Enable writing dimension-agnostic `@parallel_indices` kernels by supporting parallel indices generation as a tuple (syntax: `@parallel_indices (I...) <kernel>`) (#116) (see the sketch after this list)
- Add support for direct usage of `FiniteDifferences{1|2|3}D` macros and any other macros programmed with `ParallelStencil.INDICES` in `@parallel_indices` kernels (#111)
- Allow for return statements in nested functions within `@parallel_indices` kernels (#117, #118)
- Enable multiple computation calls in `@hide_communication` and explicit ranges for computation calls (#115)
- Remove initialization-related warnings in interactive usage (#113)
- Fix 2D thermo-mechanics miniapp (#74)
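A hedged sketch of a dimension-agnostic kernel using the tuple syntax quoted in #116; the `inbounds=true` keyword placement in `@init_parallel_stencil` is an assumption about #119, and the array names and sizes are illustrative:

```julia
using ParallelStencil
@init_parallel_stencil(Threads, Float64, 3, inbounds=true)  # inbounds default: assumed keyword placement (#119)

nx, ny, nz = 32, 32, 32
A = @zeros(nx, ny, nz)
B = @rand(nx, ny, nz)

# The parallel indices are generated as a tuple `I`, so the same kernel body
# works for 1D, 2D or 3D arrays.
@parallel_indices (I...) function assign!(A, B)
    A[I...] = B[I...]
    return
end

@parallel assign!(A, B)
```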
ParallelStencil.jl 0.8.2
Release notes
- Fix stream synchronization for AMDGPU backend (#109)
ParallelStencil.jl 0.8.1
Release notes
- Add AMDGPU v0.5 support (#107)
ParallelStencil.jl 0.8.0
ParallelStencil.jl 0.7.1
Release notes
- Make shared memory allocation robust for compilation throughout all CUDA/AMDGPU versions (#98)
ParallelStencil.jl 0.7.0
Release notes
- Add keyword `memopt` to `@parallel` and `@parallel_indices`, exposing generalized optimization of fast memory usage (of registers and shared memory) (#81, #94)
- Add support for AMDGPU (#69, #81, #93, #95)
- Add support for arrays of small arrays/structs leveraging CellArrays via keywords in the hardware-agnostic allocators (#54, #95)
- Add `@fill`, `@falses` and `@trues` allocators (#54) (see the sketch after this list)
- Enable allocation with enums using `@fill` and `@rand` (#62)
- Support numbertype omission in the ParallelStencil initialization (#47)
- Add macro to compute harmonic averages (#57)
- Add documentation for the `memopt` optimization, CellArrays and AMDGPU (#97)
- Add support for CUDA v4 (#81)
- Add support for Julia 1.9 (#81)
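A hedged sketch of the new allocators and the CellArrays integration; the `celldims` keyword name and the `@fill` argument order (value first, then dimensions, mirroring `Base.fill`) are assumptions, and the `memopt` keyword is not shown here:

```julia
using ParallelStencil
@init_parallel_stencil(Threads, Float64, 3)

nx, ny, nz = 32, 32, 32

# New allocators (argument forms assumed to mirror Base.fill/falses/trues):
F = @fill(3.0, nx, ny, nz)  # array filled with the value 3.0
M = @falses(nx, ny, nz)     # boolean array of falses
K = @trues(nx, ny, nz)      # boolean array of trues

# Arrays of small cells via CellArrays: the allocator keyword `celldims` is an
# assumption based on the CellArrays integration; each cell here is a 3x3 matrix.
V = @zeros(nx, ny, nz, celldims=(3, 3))
```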