add a note about reinterpret's memory layout #199

Merged
merged 8 commits into from
Aug 27, 2021

Conversation

johnnychen94
Member

I didn't add the StructArray{Point{Float64}}(X, dims=2) trick here because it still seems quite strange to me: it makes a strong assumption about how you interpret the raw data, whereas the confusion I got from #197 is about how to get the "actual" memory layout from an already constructed StructArray.

closes #197

Collaborator

piever commented Aug 26, 2021

Thanks, I've added some changes to wording, and a possible couple of sentences to put at the beginning for context.

> I didn't add the StructArray{Point{Float64}}(X, dims=2) trick here because it is still quite strange to me; it has a strong assumption to how you interpret the data from the raw contents

I actually think it should be mentioned in this section, because it is the easiest way to get a StructArray from a higher-dimensional array of primitive types. Some of the confusion, IMO, comes from the fact that, unlike Vector{ComplexF64}, in StructArray{ComplexF64} there is no single "block of memory" to speak of, but two separate vectors which will be at distinct locations. The only way to associate it with a single "block of memory" is to choose those vectors as contiguous views of an existing matrix.
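To make the point about separate component vectors concrete, here is a minimal sketch. It assumes the `StructArrays` package is installed; the variable names (`re_part`, `im_part`, `s`, `t`) are only illustrative.

```julia
using StructArrays

# Two independently allocated vectors hold the real and imaginary parts.
re_part = [1.0, 3.0]
im_part = [2.0, -1.0]
s = StructArray{ComplexF64}((re_part, im_part))

# The components are the vectors passed in, stored at distinct locations.
@show s.re === re_part
@show pointer(s.re) != pointer(s.im)

# A single block of memory only enters the picture when the components
# are chosen as views of one existing matrix.
v = Float64[1 3; 2 -1]
t = StructArray{ComplexF64}((view(v, :, 1), view(v, :, 2)))
@show parent(t.re) === v
@show parent(t.im) === v
```

In the second case both components are column views into `v`, so `v` is the only array that owns the data.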

As a meta comment, I think this section should go among the "Advanced" sections at the end of the README. reinterpret is a bit on the technical side in my opinion.

StructArrays also provides a way to reconstruct from a given memory block via `dims` keyword:

```julia
julia> v = Float64[1 3; 2 -1]
```

Member Author

If I use v = [1 3; 2 -1] then I get

julia> StructArray{ComplexF64}(v, dims=1)
0-element StructArray(StructArray(), StructArray()) with eltype ComplexF64 with indices 1:0

Is this expected?

Collaborator

Are you sure? I get

julia> v = Float64[1 3; 2 -1]
2×2 Matrix{Float64}:
 1.0   3.0
 2.0  -1.0

julia> StructArray{ComplexF64}(v, dims=1)
2-element StructArray(view(::Matrix{Float64}, 1, :), view(::Matrix{Float64}, 2, :)) with eltype ComplexF64:
 1.0 + 2.0im
 3.0 - 1.0im

which is the expected behavior. Btw, dims=ndims(v) would be the way to get contiguous views as component arrays (selecting on the last dimension).
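As a small sketch of the difference between `dims=1` and `dims=ndims(v)` (assuming the `StructArrays` package is installed; `s_rows` and `s_cols` are illustrative names), the same matrix gives different elements depending on which dimension is split into components:

```julia
using StructArrays

v = Float64[1 3; 2 -1]

# dims=1: the components are the rows of v (strided views).
s_rows = StructArray{ComplexF64}(v, dims=1)

# dims=ndims(v): the components are the columns of v (contiguous views).
s_cols = StructArray{ComplexF64}(v, dims=ndims(v))

@show s_rows == [1.0 + 2.0im, 3.0 - 1.0im]
@show s_cols == [1.0 + 3.0im, 2.0 - 1.0im]
```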

Member Author

johnnychen94 commented Aug 26, 2021

Sorry, I meant:

julia> v = Int[1 3; 2 -1]
2×2 Matrix{Int64}:
 1   3
 2  -1

julia> StructArray{ComplexF64}(v, dims=1)
0-element StructArray(StructArray(), StructArray()) with eltype ComplexF64 with indices 1:0

julia> StructArray{ComplexF64}(reinterpret(Float64, v), dims=1)
2-element StructArray(view(reinterpret(Float64, ::Matrix{Int64}), 1, :), view(reinterpret(Float64, ::Matrix{Int64}), 2, :)) with eltype ComplexF64:
 5.0e-324 + 1.0e-323im
 1.5e-323 + NaN*im

julia> v = ComplexF64[1 3; 2 -1]
2×2 Matrix{ComplexF64}:
 1.0+0.0im   3.0+0.0im
 2.0+0.0im  -1.0+0.0im

julia> StructArray{ComplexF64}(v, dims=1)
2-element view(::Matrix{ComplexF64}, 1, :) with eltype ComplexF64:
 1.0 + 0.0im
 3.0 + 0.0im

julia> StructArray{ComplexF64}(v, dims=2)
2-element view(::Matrix{ComplexF64}, :, 1) with eltype ComplexF64:
 1.0 + 0.0im
 2.0 + 0.0im

Interpreting the output is quite, hmmm, unintuitive. Maybe I just hit some undefined behavior.

Collaborator

piever commented Aug 26, 2021

Ah, I see, it behaves a bit funny if the types don't match. It's probably because it has some logic to support nested cases; I've opened #200 to track this. Note that this constructor does not allocate, so it can't use a matrix of integers to store components of ComplexF64.
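One possible way to sidestep the mismatch, sketched here under the assumption that an extra copy of the data is acceptable, is to convert the integer matrix to the component type first. The names `v_int` and `v_float` are illustrative, and the result is no longer backed by the original integer matrix.

```julia
using StructArrays

v_int = Int[1 3; 2 -1]

# Broadcasting Float64 allocates a new Matrix{Float64}; the StructArray
# below is a set of views into this copy, not into v_int.
v_float = Float64.(v_int)

s = StructArray{ComplexF64}(v_float, dims=1)
@show s == [1.0 + 2.0im, 3.0 - 1.0im]
@show parent(s.re) === v_float
```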

README.md Outdated
1.0 3.0
2.0 -1.0

julia> StructArray{ComplexF64}(v, dims=1) # the actual memory is `([1.0, 3.0], [2.0, -1.0])`
Collaborator

Actually, here and below the memory is always v; this constructor does not allocate new storage for the data.

Member Author

Oh, thanks for pointing this out! I didn't realize this. I only did a quick check of @btime StructArray{ComplexF64}(v, dims=1) with a small v = Float64[1 3; 2 -1], and I thought the memory allocations meant a copy 😂

julia> @btime StructArray{ComplexF64}(v, dims=1);
521.754 ns (8 allocations: 352 bytes)
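An allocation count by itself does not show whether the data was copied. A more direct check, sketched here assuming the `StructArrays` package is installed, is to test whether the components are views into `v` and whether writes propagate in both directions:

```julia
using StructArrays

v = Float64[1 3; 2 -1]
s = StructArray{ComplexF64}(v, dims=1)

# Both components are views whose parent is the original matrix.
@show parent(s.re) === v
@show parent(s.im) === v

# Writing through the StructArray changes v ...
s[1] = 10.0 + 20.0im
@show v[1, 1] == 10.0
@show v[2, 1] == 20.0

# ... and writing to v is visible through the StructArray.
v[1, 2] = 7.0
@show s[2] == 7.0 - 1.0im
```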

README.md Outdated
Comment on lines 410 to 412
This, however, depends on the underlying data layout and how you interpret the memory block. You
should use this with caution because otherwise it might give you unexpected results. To get the
"same" memory layout with the raw data `v`, you can always pass `dims=ndims(v)`.
Collaborator

Maybe best to just mention the actual caveat, namely that v must be typed correctly. Rather than being about memory layout (the only in-memory object is always v), dims=ndims(v) is used to get the best performance, because then the components of the StructArray are contiguous views (e.g., things like @view v[:,1], which are the most efficient in column-major languages).
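The contiguity argument can be checked by looking at the strides of the component views. This is a small sketch assuming the `StructArrays` package is installed; `rows` and `cols` are illustrative names.

```julia
using StructArrays

v = Float64[1 3; 2 -1]

rows = StructArray{ComplexF64}(v, dims=1)         # components like view(v, 1, :)
cols = StructArray{ComplexF64}(v, dims=ndims(v))  # components like view(v, :, 1)

# In a column-major 2×2 matrix, stepping along a row skips 2 elements,
# while stepping along a column moves to the adjacent element.
@show strides(rows.re) == (2,)
@show strides(cols.re) == (1,)
```

A unit stride means the component is a contiguous slice of `v`, which is the case piever describes as most efficient.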

johnnychen94
Member Author

Getting a better understanding of this now. Thanks for the feedback!

Hope this is the last commit 😄

Collaborator

piever commented Aug 27, 2021

Great, I'm glad doing this was instructive!

@piever piever merged commit 8958925 into JuliaArrays:master Aug 27, 2021
@johnnychen94 johnnychen94 deleted the jc/reinterpret branch August 28, 2021 00:04
Successfully merging this pull request may close these issues.

reinterpret might give wrong result?
2 participants