Option to output GeoArrow-encoded parquet from geopandas #799

Open
keller-mark opened this issue Dec 3, 2024 · 1 comment

keller-mark commented Dec 3, 2024

Is your feature request related to a problem? Please describe.
Currently, spatialdata stores Shape elements on disk as WKB-encoded parquet files, because WKB is the default encoding used by GeoPandas' to_parquet (see its geometry_encoding parameter) (code).

WKB encoding has a high overhead for decoding/deserialization. For background:

Describe the solution you'd like
Would it be possible to use to_parquet(geometry_encoding='geoarrow') as the default?
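
To make the request concrete, here is a minimal sketch of what the write path could look like. The helper name write_shapes is hypothetical and is not part of spatialdata. The only real API it relies on is GeoDataFrame.to_parquet with its geometry_encoding parameter, which accepts 'WKB' or 'geoarrow' in recent GeoPandas releases.

```python
def write_shapes(gdf, path, geometry_encoding="geoarrow"):
    """Hypothetical helper: write a GeoDataFrame to parquet with a chosen encoding.

    `gdf` is assumed to be a geopandas.GeoDataFrame (GeoPandas >= 1.0, which
    provides the `geometry_encoding` parameter on `to_parquet`).
    """
    allowed = {"WKB", "geoarrow"}
    if geometry_encoding not in allowed:
        raise ValueError(
            f"geometry_encoding must be one of {sorted(allowed)}, "
            f"got {geometry_encoding!r}"
        )
    # Forward the choice to GeoPandas; everything else uses its defaults.
    gdf.to_parquet(path, geometry_encoding=geometry_encoding)
    return geometry_encoding
```

With a real GeoDataFrame, write_shapes(gdf, "shapes.parquet") would write GeoArrow-encoded geometries, and passing geometry_encoding="WKB" would keep the current behaviour.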

Describe alternatives you've considered
Alternatively, spatialdata could let users opt in to GeoArrow encoding (as opposed to WKB).

Additional context
The documentation would need to state that the on-disk representation may use either WKB or GeoArrow encoding (or only one of the two), and explain how to detect which one was used.
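
On detection, one possible approach is to read the GeoParquet file metadata. This is a sketch, assuming the file follows the GeoParquet spec: the "geo" key in the parquet key-value metadata holds JSON whose per-column "encoding" field is either "WKB" or, for GeoArrow native encodings (GeoParquet 1.1), a geometry type name such as "point" or "polygon". The function names below are hypothetical, and pyarrow is only imported when a file is actually read.

```python
import json

# Encoding values that GeoParquet 1.1 uses for GeoArrow native encodings.
GEOARROW_ENCODINGS = {
    "point", "linestring", "polygon",
    "multipoint", "multilinestring", "multipolygon",
}


def classify_geo_metadata(geo_json):
    """Map each geometry column to 'wkb', 'geoarrow', or 'unknown'.

    `geo_json` is the raw value (str or bytes) of the "geo" metadata key.
    """
    meta = json.loads(geo_json)
    result = {}
    for name, column in meta["columns"].items():
        encoding = column["encoding"]
        if encoding == "WKB":
            result[name] = "wkb"
        elif encoding in GEOARROW_ENCODINGS:
            result[name] = "geoarrow"
        else:
            result[name] = "unknown"
    return result


def detect_encoding(path):
    """Hypothetical helper: read a parquet file's schema and classify it."""
    import pyarrow.parquet as pq  # imported lazily; only needed for file access

    metadata = pq.read_schema(path).metadata or {}
    raw = metadata.get(b"geo")
    if raw is None:
        raise ValueError(f"{path} has no 'geo' metadata; not a GeoParquet file?")
    return classify_geo_metadata(raw)
```

Only the schema is read, so the check does not need to load any geometry data.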

LucaMarconato (Member) commented

Thanks for reporting this. Do you have an estimate of the performance improvement you would expect to observe? We should also run some benchmarks on our side.
