From a84ff9c126f4ff557c82aede7b5086e1101f1a4a Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 00:48:41 +0800
Subject: [PATCH 1/8] add a note about `reinterpret`'s memory layout

---
 README.md | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/README.md b/README.md
index bdceb6cc..9fde3cf3 100644
--- a/README.md
+++ b/README.md
@@ -130,6 +130,35 @@ julia> replace_storage(CuArray, s)
  2.0 - 1.0im
 ```
 
+### Get the actual memory layout
+
+Using `reinterpret` on `StructArray` won't give you the real memory order as the reinterpretation works on an element-wise sense:
+
+```julia
+julia> s = StructArray([1.0+im, 2.0-im])
+2-element StructArray(::Vector{Float64}, ::Vector{Float64}) with eltype ComplexF64:
+ 1.0 + 1.0im
+ 2.0 - 1.0im
+
+julia> reinterpret(Float64, s) # In memory this is actually stored in order [1.0, 2.0, 1.0, -1.0], assuming the tuples are contiguous.
+4-element reinterpret(Float64, StructArray(::Vector{Float64}, ::Vector{Float64})):
+  1.0
+  1.0
+  2.0
+ -1.0
+```
+
+If you already have `StructArray` created, the easiest way is to directly stack the components in memory order:
+
+```julia
+julia> using StackViews # lazily cat/stack arrays in a new tailing dimension
+
+julia> StackView(StructArrays.components(s)...)
+2×2 StackView{Float64, 2, 2, Tuple{Vector{Float64}, Vector{Float64}}}:
+ 1.0   1.0
+ 2.0  -1.0
+```
+
 ## Example usage to store a data table
 
 ```julia
From f30a9aba2512c72ae3257e663d64d653543b6fd3 Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 00:53:39 +0800
Subject: [PATCH 2/8] rephrase the words

---
 README.md | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 9fde3cf3..228377cd 100644
--- a/README.md
+++ b/README.md
@@ -135,17 +135,15 @@ julia> replace_storage(CuArray, s)
 Using `reinterpret` on `StructArray` won't give you the real memory order as the reinterpretation works on an element-wise sense:
 
 ```julia
-julia> s = StructArray([1.0+im, 2.0-im])
+julia> s = StructArray([1.0+3im, 2.0-im])
 2-element StructArray(::Vector{Float64}, ::Vector{Float64}) with eltype ComplexF64:
  1.0 + 1.0im
  2.0 - 1.0im
 
-julia> reinterpret(Float64, s) # In memory this is actually stored in order [1.0, 2.0, 1.0, -1.0], assuming the tuples are contiguous.
-4-element reinterpret(Float64, StructArray(::Vector{Float64}, ::Vector{Float64})):
-  1.0
-  1.0
-  2.0
- -1.0
+julia> reinterpret(reshape, Float64, s) # The actuall memory is `[[1.0, 2.0], [3.0, -1.0]]`
+2×2 reinterpret(reshape, Float64, StructArray(::Vector{Float64}, ::Vector{Float64})) with eltype Float64:
+ 1.0   2.0
+ 3.0  -1.0
 ```
 
 If you already have `StructArray` created, the easiest way is to directly stack the components in memory order:
@@ -155,7 +153,7 @@ julia> using StackViews # lazily cat/stack arrays in a new tailing dimension
 
 julia> StackView(StructArrays.components(s)...)
 2×2 StackView{Float64, 2, 2, Tuple{Vector{Float64}, Vector{Float64}}}:
- 1.0   1.0
+ 1.0   3.0
  2.0  -1.0
 ```
 
From 377491801139909fa569b9f839f13fb99c633c11 Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 17:51:14 +0800
Subject: [PATCH 3/8] apply suggestions

---
 README.md | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 228377cd..425736d3 100644
--- a/README.md
+++ b/README.md
@@ -130,9 +130,26 @@ julia> replace_storage(CuArray, s)
  2.0 - 1.0im
 ```
 
-### Get the actual memory layout
+### StructArrays versus struct-of-arrays layout in higher-dimensional array
 
-Using `reinterpret` on `StructArray` won't give you the real memory order as the reinterpretation works on an element-wise sense:
+Regular arrays of structs can sometimes be reinterpreted as arrays of primitive values with an added
+initial dimension.
+
+```julia
+julia> v = [1.0+3im, 2.0-im]
+2-element Vector{ComplexF64}:
+ 1.0 + 3.0im
+ 2.0 - 1.0im
+
+julia> reinterpret(reshape, Float64, v)
+2×2 reinterpret(reshape, Float64, ::Vector{ComplexF64}) with eltype Float64:
+ 1.0   2.0
+ 3.0  -1.0
+```
+
+However, the situation is more complex for the `StructArray` format, where `s = StructArray(v)` is
+stored as two separate `Vector{Float64}`. `reinterpret` on `StructArray` returns an
+"array-of-structs" layout, as the reinterpretation works element-wise:
 
 ```julia
 julia> s = StructArray([1.0+3im, 2.0-im])
@@ -140,13 +157,14 @@ julia> s = StructArray([1.0+3im, 2.0-im])
  1.0 + 1.0im
  2.0 - 1.0im
 
-julia> reinterpret(reshape, Float64, s) # The actuall memory is `[[1.0, 2.0], [3.0, -1.0]]`
+julia> reinterpret(reshape, Float64, s) # The actual memory is `([1.0, 2.0], [3.0, -1.0])`
 2×2 reinterpret(reshape, Float64, StructArray(::Vector{Float64}, ::Vector{Float64})) with eltype Float64:
  1.0   2.0
  3.0  -1.0
 ```
 
-If you already have `StructArray` created, the easiest way is to directly stack the components in memory order:
+If you already have a `StructArray`, the easiest way is to get the higher-dimensional
+"struct-of-arrays" layout is to directly stack the components in memory order:
 
 ```julia
 julia> using StackViews # lazily cat/stack arrays in a new tailing dimension
From bb5247d5e735597600cc6a9daff5003c6028ddab Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 17:53:38 +0800
Subject: [PATCH 4/8] move to advanced section

---
 README.md | 90 +++++++++++++++++++++++++++----------------------------
 1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/README.md b/README.md
index 425736d3..208b6849 100644
--- a/README.md
+++ b/README.md
@@ -130,51 +130,6 @@ julia> replace_storage(CuArray, s)
  2.0 - 1.0im
 ```
 
-### StructArrays versus struct-of-arrays layout in higher-dimensional array
-
-Regular arrays of structs can sometimes be reinterpreted as arrays of primitive values with an added
-initial dimension.
-
-```julia
-julia> v = [1.0+3im, 2.0-im]
-2-element Vector{ComplexF64}:
- 1.0 + 3.0im
- 2.0 - 1.0im
-
-julia> reinterpret(reshape, Float64, v)
-2×2 reinterpret(reshape, Float64, ::Vector{ComplexF64}) with eltype Float64:
- 1.0   2.0
- 3.0  -1.0
-```
-
-However, the situation is more complex for the `StructArray` format, where `s = StructArray(v)` is
-stored as two separate `Vector{Float64}`. `reinterpret` on `StructArray` returns an
-"array-of-structs" layout, as the reinterpretation works element-wise:
-
-```julia
-julia> s = StructArray([1.0+3im, 2.0-im])
-2-element StructArray(::Vector{Float64}, ::Vector{Float64}) with eltype ComplexF64:
- 1.0 + 1.0im
- 2.0 - 1.0im
-
-julia> reinterpret(reshape, Float64, s) # The actual memory is `([1.0, 2.0], [3.0, -1.0])`
-2×2 reinterpret(reshape, Float64, StructArray(::Vector{Float64}, ::Vector{Float64})) with eltype Float64:
- 1.0   2.0
- 3.0  -1.0
-```
-
-If you already have a `StructArray`, the easiest way is to get the higher-dimensional
-"struct-of-arrays" layout is to directly stack the components in memory order:
-
-```julia
-julia> using StackViews # lazily cat/stack arrays in a new tailing dimension
-
-julia> StackView(StructArrays.components(s)...)
-2×2 StackView{Float64, 2, 2, Tuple{Vector{Float64}, Vector{Float64}}}:
- 1.0   3.0
- 2.0  -1.0
-```
-
 ## Example usage to store a data table
 
 ```julia
@@ -387,3 +342,48 @@ julia> s
  Foo(44, "d")
  Foo(55, "e")
 ```
+
+## Advanced: StructArrays versus struct-of-arrays layout in higher-dimensional array
+
+Regular arrays of structs can sometimes be reinterpreted as arrays of primitive values with an added
+initial dimension.
+
+```julia
+julia> v = [1.0+3im, 2.0-im]
+2-element Vector{ComplexF64}:
+ 1.0 + 3.0im
+ 2.0 - 1.0im
+
+julia> reinterpret(reshape, Float64, v)
+2×2 reinterpret(reshape, Float64, ::Vector{ComplexF64}) with eltype Float64:
+ 1.0   2.0
+ 3.0  -1.0
+```
+
+However, the situation is more complex for the `StructArray` format, where `s = StructArray(v)` is
+stored as two separate `Vector{Float64}`. `reinterpret` on `StructArray` returns an
+"array-of-structs" layout, as the reinterpretation works element-wise:
+
+```julia
+julia> s = StructArray([1.0+3im, 2.0-im])
+2-element StructArray(::Vector{Float64}, ::Vector{Float64}) with eltype ComplexF64:
+ 1.0 + 1.0im
+ 2.0 - 1.0im
+
+julia> reinterpret(reshape, Float64, s) # The actual memory is `([1.0, 2.0], [3.0, -1.0])`
+2×2 reinterpret(reshape, Float64, StructArray(::Vector{Float64}, ::Vector{Float64})) with eltype Float64:
+ 1.0   2.0
+ 3.0  -1.0
+```
+
+If you already have a `StructArray`, the easiest way is to get the higher-dimensional
+"struct-of-arrays" layout is to directly stack the components in memory order:
+
+```julia
+julia> using StackViews # lazily cat/stack arrays in a new tailing dimension
+
+julia> StackView(StructArrays.components(s)...)
+2×2 StackView{Float64, 2, 2, Tuple{Vector{Float64}, Vector{Float64}}}:
+ 1.0   3.0
+ 2.0  -1.0
+```
From 75f7e1dfa4b5f62bbd1f680a2e24a2f4f7a66a7c Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 18:07:06 +0800
Subject: [PATCH 5/8] add example for dims keyword

---
 README.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/README.md b/README.md
index 208b6849..9a35ed52 100644
--- a/README.md
+++ b/README.md
@@ -387,3 +387,25 @@ julia> StackView(StructArrays.components(s)...)
  1.0   3.0
  2.0  -1.0
 ```
+
+StructArrays also provides a way to reconstruct from a given memory block via `dims` keyword:
+
+```julia
+julia> v = Float64[1 3; 2 -1]
+2×2 Matrix{Float64}:
+ 1.0   3.0
+ 2.0  -1.0
+
+julia> StructArray{ComplexF64}(v, dims=1) # the actual memory is `([1.0, 3.0], [2.0, -1.0])`
+2-element StructArray(view(::Matrix{Float64}, 1, :), view(::Matrix{Float64}, 2, :)) with eltype ComplexF64:
+ 1.0 + 2.0im
+ 3.0 - 1.0im
+
+julia> s = StructArray{ComplexF64}(v, dims=2) # the actual memory is `([1.0, 2.0], [3.0, -1.0])`
+2-element StructArray(view(::Matrix{Float64}, :, 1), view(::Matrix{Float64}, :, 2)) with eltype ComplexF64:
+ 1.0 + 3.0im
+ 2.0 - 1.0im
+```
+
+This, however, depends on the underlying data layout and how you interpret the memory block. You
+should use this with caution because otherwise it might give you unexpected results.
From ae4072740e1e4b894fd377cc26b8f91066a9a411 Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 18:38:44 +0800
Subject: [PATCH 6/8] one more note on `dims=ndims(v)`

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9a35ed52..e13ff1f5 100644
--- a/README.md
+++ b/README.md
@@ -408,4 +408,5 @@ julia> s = StructArray{ComplexF64}(v, dims=2) # the actual memory is `([1.0, 2.0
 ```
 
 This, however, depends on the underlying data layout and how you interpret the memory block. You
-should use this with caution because otherwise it might give you unexpected results.
+should use this with caution because otherwise it might give you unexpected results. To get the
+"same" memory layout with the raw data `v`, you can always pass `dims=ndims(v)`.
From 1469c8452d7fdeedc3c144f02a75f4e76bf5def8 Mon Sep 17 00:00:00 2001
From: Johnny Chen
Date: Thu, 26 Aug 2021 22:10:22 +0800
Subject: [PATCH 7/8] explain the memory order and the view perspective

---
 README.md | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index e13ff1f5..41db779b 100644
--- a/README.md
+++ b/README.md
@@ -388,7 +388,8 @@ julia> StackView(StructArrays.components(s)...)
  2.0  -1.0
 ```
 
-StructArrays also provides a way to reconstruct from a given memory block via `dims` keyword:
+StructArrays also provides `dims` keyword to reinterpret a given memory block without creating new
+memory:
 
 ```julia
 julia> v = Float64[1 3; 2 -1]
@@ -396,17 +397,27 @@ julia> v = Float64[1 3; 2 -1]
  1.0   3.0
  2.0  -1.0
 
-julia> StructArray{ComplexF64}(v, dims=1) # the actual memory is `([1.0, 3.0], [2.0, -1.0])`
+julia> s = StructArray{ComplexF64}(v, dims=1)
 2-element StructArray(view(::Matrix{Float64}, 1, :), view(::Matrix{Float64}, 2, :)) with eltype ComplexF64:
  1.0 + 2.0im
  3.0 - 1.0im
 
-julia> s = StructArray{ComplexF64}(v, dims=2) # the actual memory is `([1.0, 2.0], [3.0, -1.0])`
+julia> s = StructArray{ComplexF64}(v, dims=2)
 2-element StructArray(view(::Matrix{Float64}, :, 1), view(::Matrix{Float64}, :, 2)) with eltype ComplexF64:
  1.0 + 3.0im
  2.0 - 1.0im
+
+julia> s[1] = 0+0im; s # `s` is a reinterpretation view and doesn't copy memory
+2-element StructArray(view(::Matrix{Float64}, :, 1), view(::Matrix{Float64}, :, 2)) with eltype ComplexF64:
+ 0.0 + 0.0im
+ 2.0 - 1.0im
+
+julia> v # thus `v` will be modified as well
+2×2 Matrix{Float64}:
+ 0.0   0.0
+ 2.0  -1.0
 ```
 
-This, however, depends on the underlying data layout and how you interpret the memory block. You
-should use this with caution because otherwise it might give you unexpected results. To get the
-"same" memory layout with the raw data `v`, you can always pass `dims=ndims(v)`.
+For column-major arrays, reinterpreting along the last dimension (`dims=ndims(v)`) makes every
+component of `s` reflects a contiguous memory and thus will be more efficient. In previous example,
+when `dims=2` we have `s.re == [1.0, 2.0]`, which reflects the first column of `v`.
From 7856ded3e5d4b5e024ab1278a08a8e3e83409ab7 Mon Sep 17 00:00:00 2001
From: Pietro Vertechi
Date: Fri, 27 Aug 2021 17:26:02 +0200
Subject: [PATCH 8/8] Minor wording change

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 41db779b..65aeefe9 100644
--- a/README.md
+++ b/README.md
@@ -419,5 +419,5 @@ julia> v # thus `v` will be modified as well
 ```
 
 For column-major arrays, reinterpreting along the last dimension (`dims=ndims(v)`) makes every
-component of `s` reflects a contiguous memory and thus will be more efficient. In previous example,
+component of `s` a view of contiguous memory and thus is more efficient. In the previous example,
 when `dims=2` we have `s.re == [1.0, 2.0]`, which reflects the first column of `v`.
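
The contiguity claim in the final README paragraph can be checked with plain `Base` views, without loading StructArrays. This is only a sketch: it assumes the components of `StructArray{ComplexF64}(v, dims=d)` are the views printed in the REPL output of the patch (`view(v, 1, :)` for `dims=1`, `view(v, :, 1)` for `dims=2`), and the names `row_component` and `col_component` are illustrative, not part of any API.

```julia
# Column-major 2×2 matrix, same data as `v` in the README example.
v = Float64[1 3; 2 -1]

# Assumed to match the real component for `dims=1`: the first row of `v`.
row_component = view(v, 1, :)
# Assumed to match the real component for `dims=2`: the first column of `v`.
col_component = view(v, :, 1)

# A stride is the distance, in elements, between consecutive entries in memory.
println(strides(row_component))  # (2,) -> every other element, not contiguous
println(strides(col_component))  # (1,) -> adjacent elements, contiguous

# The column view holds the values the README reports as `s.re` for `dims=2`.
println(col_component == [1.0, 2.0])  # true
```

A stride of 1 means the component walks adjacent memory, which is the reason the README gives for preferring `dims=ndims(v)` on column-major arrays.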