Releases: omlins/ParallelStencil.jl

ParallelStencil.jl 0.11.1

16 Feb 16:40
62eff06

Release notes

  • Use extension for Enzyme dependency (#139, #140)
  • Add support for CellArrays 0.2 (#141, #142)

ParallelStencil.jl 0.11.0

08 Jan 19:31
999f0e5

Release notes

  • Change the scope of ParallelStencil initialization parameters to module scope: @init_parallel_stencil must now be called once per module, ideally right after using ParallelStencil (CHECK COMPATIBILITY!) (#130)
  • Enable dimension-agnostic kernel generation thanks to new ndims tuple expansion and type parameter substitution keyword N (#134, #136)
  • Enable architecture-agnostic creation of indices arrays with the allocator macros (#137)
  • Fix type instability when using type aliases directly (#135)
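
The module-scope change in #130 can be illustrated with a minimal sketch. The module name DiffusionSketch, the kernel diffusion_step!, the helper demo and all numeric values are hypothetical and chosen only for illustration; the Threads backend is assumed so that the sketch runs on a CPU. The point is that the call to @init_parallel_stencil sits inside the module that defines the kernels.

```julia
module DiffusionSketch  # hypothetical module name, for illustration only

using ParallelStencil
using ParallelStencil.FiniteDifferences2D
# As of 0.11.0 the initialization is scoped to the enclosing module,
# so it is called once here, right after `using ParallelStencil`.
@init_parallel_stencil(Threads, Float64, 2)

# One explicit diffusion step on the inner points of T (hypothetical kernel).
@parallel function diffusion_step!(T2, T, lam, dt, dx, dy)
    @inn(T2) = @inn(T) + dt*lam*(@d2_xi(T)/dx^2 + @d2_yi(T)/dy^2)
    return
end

# Allocate two fields and apply a single step (illustrative values).
function demo(nx=16, ny=16)
    T  = @rand(nx, ny)   # hardware-agnostic allocator
    T2 = copy(T)
    @parallel diffusion_step!(T2, T, 1.0, 1e-3, 0.1, 0.1)
    return T, T2
end

end # module DiffusionSketch
```

A second module that defines its own kernels would need its own @init_parallel_stencil call; code written for earlier versions that relied on a single call for several modules is what the compatibility warning above refers to.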

ParallelStencil.jl 0.10.1

13 Dec 19:01
ec9e737

Release notes

  • Add @d2_xa and @d2_ya macros for 2D (#131)

ParallelStencil.jl 0.10.0

30 Nov 18:59
6955cb7

Release notes

  • Enable automatic range detection for variables of any bitstype and for nested (named) tuples containing any supported type (#121)
  • Add tuple type aliases and NamedTuple constructors in the Data module (#121)
  • Allow complex arguments for compute macros (expanding arguments) (#121)
  • Add support for AMDGPU 0.8 (#123)
  • Add support for CUDA 5 (#129)

ParallelStencil.jl 0.9.0

13 Sep 12:37
7d93153

Release notes

  • Add support for automatic addition of @inbounds in kernels via a keyword (inbounds) and make the global default settable in @init_parallel_stencil (#119)
  • Enable usage of Structs, NamedTuples and other expressions in @parallel ∇ calls (#114)
  • Enable writing dimension-agnostic @parallel_indices kernels by supporting parallel indices generation as a tuple (syntax: @parallel_indices (I...) <kernel>) (#116)
  • Add support for direct usage of FiniteDifferences{1|2|3}D macros and any other macros programmed with ParallelStencil.INDICES in @parallel_indices kernels (#111)
  • Allow for return statements in nested functions within @parallel_indices kernels (#117, #118)
  • Enable multiple computation calls in @hide_communication and explicit ranges for computation calls (#115)
  • Remove initialization-related warnings in interactive usage (#113)
  • Fix 2D thermo-mechanics miniapp (#74)

ParallelStencil.jl 0.8.2

25 Aug 09:31
9b49850

Release notes

  • Fix stream synchronization for AMDGPU backend (#109)

ParallelStencil.jl 0.8.1

22 Jul 16:18
4a23081

Release notes

  • Add AMDGPU v0.5 support (#107)

ParallelStencil.jl 0.8.0

20 Jul 08:28
a76bb75

Release notes

  • Add high-level support for architecture-agnostic Enzyme-powered automatic differentiation (#101)
  • Make generic Enzyme-based automatic differentiation of ParallelStencil kernels more convenient (#99)

ParallelStencil.jl 0.7.1

05 Jul 16:46
8657681

Release notes

  • Make shared memory allocation robust for compilation throughout all CUDA/AMDGPU versions (#98)

ParallelStencil.jl 0.7.0

15 Jun 17:56
bda7ee2

Release notes

  • Add keyword memopt to @parallel and @parallel_indices, exposing generalized optimization of fast memory usage (of registers and shared memory) (#81, #94)
  • Add support for AMDGPU (#69, #81, #93, #95)
  • Add support for arrays of small arrays/structs leveraging CellArrays via keywords in the hardware-agnostic allocators (#54, #95)
  • Add @fill, @falses and @trues allocators (#54)
  • Enable allocation with enums using @fill and @rand (#62)
  • Support numbertype omission in ParallelStencil initialization (#47)
  • Add macro to compute harmonic averages (#57)
  • Add documentation for memopt optimization, CellArrays and AMDGPU (#97)
  • Add support for CUDA v4 (#81)
  • Add support for Julia 1.9 (#81)