Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add pairwise! for Covariance types #5

Merged
merged 5 commits into from
Mar 26, 2024

Conversation

markmbaum
Copy link
Contributor

@markmbaum markmbaum commented Mar 24, 2024

This adds the pairwise! function for Covariance types. I also added a docstring for the variogram's pairwise! method.

closes JuliaEarth/GeoStats.jl#405

I did also notice that the existing implementation allocates a lot of memory even though it should just be filling a destination matrix. For example:

julia> grid = CartesianGrid((2, 3))
2×3 CartesianGrid{2,Float64}
  minimum: Point(0.0, 0.0)
  maximum: Point(2.0, 3.0)
  spacing: (1.0, 1.0)

julia> A = zeros(6, 6)
6×6 Matrix{Float64}:
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0

julia> γ = GaussianVariogram()
GaussianVariogram
├─ sill: 1.0
├─ nugget: 0.0
├─ range: 1.0
└─ distance: Euclidean

julia> @benchmark GeoStatsFunctions.pairwise!($A, $γ, $grid)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  429.125 μs    1.775 ms  ┊ GC (min  max): 0.00%  72.74%
 Time  (median):     438.125 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   462.660 μs ± 165.411 μs  ┊ GC (mean ± σ):  5.08% ± 10.09%

  █▃                                                            ▁
  ██▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇█▆▄▃▃▁▁▁▁▁▁▅ █
  429 μs        Histogram: log(frequency) by time       1.69 ms <

 Memory estimate: 948.28 KiB, allocs estimate: 7707.

I'm not exactly sure why and I didn't want to mess with the original implementation.

src/covariance.jl Outdated Show resolved Hide resolved
Copy link
Member

@juliohm juliohm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test with CartesianGrid allocates because it computes covariance between the elements of the domain, in this case quadrangles. In order to compute the covariance between two quadrangles, the variogram samples points inside the quadrangles based on the range, and performs numerical integration.

The variogram and covariance don't require allocation when they are evaluated with points. For example, if you construct PointSet(centroid.(grid)) as your domain, that will be much faster and shouldn't allocate.

Please add tests here as well for coverage purposes.

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 0% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 84.36%. Comparing base (da6d967) to head (ce937d2).

Files Patch % Lines
src/covariance.jl 0.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main       #5      +/-   ##
==========================================
- Coverage   85.03%   84.36%   -0.68%     
==========================================
  Files          26       26              
  Lines         628      633       +5     
==========================================
  Hits          534      534              
- Misses         94       99       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@markmbaum
Copy link
Contributor Author

markmbaum commented Mar 25, 2024

Should be all set. I added a test that will cover the new pairwise! method and changed the implementation to a loop.

This is still a little off topic, but I profiled pairwise! with a PointSet and still found some memory allocation.

julia> Γ = Matrix{Float64}(undef, 3, 3)
3×3 Matrix{Float64}:
 1.0e-323  4.4e-323  1.2e-322
 2.5e-323  9.4e-323  1.5e-322
 4.0e-323  1.1e-322  0.0

julia> c = GaussianCovariance()
GaussianCovariance
├─ sill: 1.0
├─ nugget: 0.0
├─ range: 1.0
└─ distance: Euclidean

julia> 𝒟 = PointSet(Matrix(1.0I, 3, 3))
3 PointSet{3,Float64}
├─ Point(1.0, 0.0, 0.0)
├─ Point(0.0, 1.0, 0.0)
└─ Point(0.0, 0.0, 1.0)

julia> ProfileView.@profview begin
         for i  1:1_000_000
           GeoStatsFunctions.pairwise!(Γ, c, 𝒟)
         end
       end

Which looks like it's coming from the variosample calls in the methods for Variogram types.

Screenshot 2024-03-24 at 6 55 50 PM

pairwise(cov, domain₁, domain₂)

Evaluate covariance `cov` between all elements of `domain₁` and `domain₂`.
"""
pairwise(cov::Covariance, args...) = sill(cov.γ) .- pairwise(cov.γ, args...)

"""
pairwise!(Γ, cov, domain)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We denote the covariance matrix by C or \Sigma. \Gamma is reserved for the variogram matrix.

@juliohm
Copy link
Member

juliohm commented Mar 25, 2024

Left a small comment regarding notation. Can you please replace \Gamma --> C or \Sigma in the covariance methods?

Also, it would be nice to investigate this allocation with Point in another PR. We probably need an extra method to avoid sampling the trivial geometry (i.e., point).

src/variogram.jl Outdated Show resolved Hide resolved
@juliohm
Copy link
Member

juliohm commented Mar 25, 2024

@markmbaum this PR has conflicts with the other PR that got merged.

@juliohm juliohm merged commit 7842431 into JuliaEarth:main Mar 26, 2024
6 checks passed
@markmbaum markmbaum deleted the covariance-pairwise branch March 26, 2024 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pairwise! with Covariance types
3 participants