-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Polygon performance #109
Polygon performance #109
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good for the most part! I will test in a bit and let you know how the performance is :)
Co-authored-by: Anshul Singhvi <[email protected]>
Co-authored-by: Anshul Singhvi <[email protected]>
Co-authored-by: Anshul Singhvi <[email protected]>
# TODO add this to `Extents.union` for any `Tuple/AbstractArray` point | ||
function _union(extent::Extents.Extent, point::Tuple) | ||
X = min(extent.X[1], point[1]), max(extent.X[2], point[1]) | ||
Y = min(extent.Y[1], point[2]), max(extent.Y[2], point[2]) | ||
return Extents.Extent(; X, Y) | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
want to move this to rafaqz/Extents.jl#24 or maybe GeoInterface?
It would be cool to have Extents.union(extent, geometry)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I will, but not worth waiting for the release cycle for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add it for AbstractArray
and NamedTuple
as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annoying we cant do it for all GeoInterface.jl points...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Would be good to add a test explicitly for the cached and uncached getgeom.
Ok its running all the test geometry checks twice. This should be good to go |
Good to have this 👍🏻. Might be good to comment on how you go about finding these optimizations?Similar work probably is required for GeoArrow (once there) and other packages when we want the most performance. |
How I found this was putting a Not very fancy... but "print summary metrics while slow things run" is a good motto I guess |
For
Shapefile.Polygon
this PR speeds up looping overGI.getgeom(poly, i)
by ~100x for the first run and ~1000x after that (for very large shape files at least!)Vector{Vector{Int}}
and only calculates it on the firstgetgeom
.Extents.intersects
before actually checking_inring
for another more modest speedup.It specifically does not eagerly fill the cache on construction, because some use cases don't need it, like
GI.getring
which returns them in the current order (which is 20x faster for e.g. calculating hulls or rasterization).Users may also only want a subset of polygons, and this skips doing most of the work in that case.
@asinghvi17 if you want to review that would be great :)
(there is a tiny bit of type instability but this will be fixed in Extents.jl its just lack of inlining stopping const prop)