Add support to *.gz data at import #519

Merged: 1 commit, Mar 25, 2024

@Joxit (Member) commented Mar 21, 2024

Hi there, I added a new feature to the OpenAddresses importer.

OA now uses the GeoJSON format, which is very verbose and therefore much heavier than CSV, so disk space is becoming increasingly hard to manage.
To save space, I thought it would be nice to store only the gzipped versions on disk and import them directly. This may use more CPU, but that is the user's choice.

With this PR we are now able to import both raw and gzipped CSV and GeoJSON.
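
To illustrate the idea of reading raw and gzipped inputs through one code path, here is a minimal Python sketch. It is not the code from this PR: the function names (`open_text`, `data_format`, `read_records`) are hypothetical, and it assumes the GeoJSON files contain one feature per line.

```python
import csv
import gzip
import json
from pathlib import Path
from typing import Iterator, TextIO


def open_text(path: Path) -> TextIO:
    """Open a raw or gzip-compressed file as UTF-8 text."""
    if path.suffix.lower() == ".gz":
        # gzip.open in "rt" mode decompresses on the fly while reading.
        return gzip.open(path, mode="rt", encoding="utf-8", newline="")
    return open(path, mode="r", encoding="utf-8", newline="")


def data_format(path: Path) -> str:
    """Return 'csv' or 'geojson', ignoring a trailing '.gz' suffix."""
    suffixes = [s.lower() for s in path.suffixes]
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    if not suffixes or suffixes[-1] not in (".csv", ".geojson"):
        raise ValueError(f"unsupported file: {path}")
    return suffixes[-1].lstrip(".")


def read_records(path: Path) -> Iterator[dict]:
    """Yield one dict per CSV row or per GeoJSON feature."""
    fmt = data_format(path)
    with open_text(path) as handle:
        if fmt == "csv":
            yield from csv.DictReader(handle)
        else:
            # Assumption: line-delimited GeoJSON, one feature per line.
            for line in handle:
                if line.strip():
                    yield json.loads(line)
```

With a layout like this, callers would not need to know whether a file is compressed: `read_records(Path("fr/countrywide.csv"))` and `read_records(Path("fr/countrywide.csv.gz"))` would yield the same rows, at the cost of extra CPU for decompression.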

Example for French countrywide addresses, from the latest build and from the 2020 CSV build:

| File | Number of entries | Raw size | Gzip size |
|---|---|---|---|
| fr/countrywide-addresses-country.geojson | 26M | 6.6G | 688M |
| fr/countrywide.csv | 25M | 2.4G | 584M |
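
The savings in the table above can be worked out with a few lines of Python. This is only a rough sketch: the sizes are copied from the table, and it assumes that "G" means 1024 "M", which the PR does not state.

```python
# Sizes copied from the table above, in megabytes.
# Assumption: 1G = 1024M (the units are not specified in the PR).
SIZES_MB = {
    "fr/countrywide-addresses-country.geojson": (6.6 * 1024, 688),
    "fr/countrywide.csv": (2.4 * 1024, 584),
}


def compression_ratio(raw_mb: float, gzip_mb: float) -> float:
    """How many times smaller the gzipped file is than the raw file."""
    return raw_mb / gzip_mb


for name, (raw_mb, gzip_mb) in SIZES_MB.items():
    ratio = compression_ratio(raw_mb, gzip_mb)
    saved = 1 - gzip_mb / raw_mb
    print(f"{name}: {ratio:.1f}x smaller, {saved:.0%} of disk space saved")
```

Under that assumption the GeoJSON file shrinks by roughly a factor of ten and the CSV file by roughly a factor of four, so the more verbose format benefits most from compression.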

@missinglink missinglink merged commit 06f4e29 into master Mar 25, 2024
7 checks passed
@missinglink missinglink deleted the joxit/feat/import-gzip-data branch March 25, 2024 13:55
@missinglink (Member) commented

Looks good, thanks!
