Promote CityJSONFeatures for file storage #122
balazsdukai started this conversation in Ideas
This is a proposal to promote the use of CityJSONFeatures for file storage.
Get the data
Download a tile from https://3d.kadaster.nl/basisvoorziening-3d/.
This experiment uses the tile `68dn2`, which contains a good variety of land uses.
The tile is split into four sub-tiles.
Each file is upgraded to CityJSON v1.1 and exported as CityJSON Lines with cjio 0.7.4.
File sizes
The file size comparison is shown in the table below.
As you can see, converting a regular CityJSON file to CityJSON Lines leads to a 16-17% reduction in file size, probably because each feature indexes its own short vertex list, so the indices in the boundary arrays are smaller.
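The index-size explanation can be illustrated with a toy sketch. It is not a measurement on the 68dn2 tile: the boundary and its vertex indices are made up, and `localize_boundary` is a hypothetical helper, not part of cjio. It re-indexes one surface from global vertex indices to feature-local ones and compares the serialized lengths.

```python
import json


def localize_boundary(boundary):
    """Re-index a nested boundary array so that vertex indices start at 0.

    Returns the re-indexed boundary and the global indices in the order
    they were first seen (a feature's local vertex list follows this order).
    """
    mapping = {}

    def visit(node):
        if isinstance(node, list):
            return [visit(child) for child in node]
        # A leaf is a global vertex index; assign the next free local index.
        if node not in mapping:
            mapping[node] = len(mapping)
        return mapping[node]

    return visit(boundary), list(mapping)


# Hypothetical MultiSurface: one ring pointing deep into a large global vertex list.
global_boundary = [[[1204511, 1204512, 1204513, 1204514]]]
local_boundary, used_vertices = localize_boundary(global_boundary)

global_size = len(json.dumps(global_boundary, separators=(",", ":")))
local_size = len(json.dumps(local_boundary, separators=(",", ":")))
print(local_boundary, global_size, local_size)
```

The saving per index grows with the size of the global vertex list, which is consistent with the reduction being attributed to the boundary arrays, though the sketch does not prove that this is the only cause.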
Loading the files
In this section I compared the speed and memory footprint of looping through each CityObject in a data set.
This operation is common in applications that manipulate a whole city model, which covers almost all of cjio's operations.
However, the tests below are not relevant for CityJSON libraries that manipulate individual CityObjects, because they typically want to store the whole city model in memory anyway.
When a city model is stored in its entirety in a single CityJSON object, we need to load the whole object into memory in order to access the `transform` and `vertices` objects, for instance.
With CityJSONFeatures we can read the file line by line, processing and discarding the CityObjects one by one.
This is very efficient in terms of both CPU and memory usage, provided that the first object in the file is the CityJSON object that contains the metadata and the `transform` property required for parsing the CityObjects.
Operations that would benefit greatly from this include subsetting and merging city models.
Things like EPSG reassignment or metadata updates wouldn't even require looping through the features, only altering the first object in the file.
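The line-by-line reading described here can be sketched as follows. This is a minimal illustration and not the script used in the experiment: `iter_cityobjects` and the sample data are hypothetical. It assumes that the first line is the CityJSON object carrying `transform` and that every following line is a CityJSONFeature with its own `vertices`.

```python
import io
import json
from collections import Counter


def iter_cityobjects(lines):
    """Yield (id, cityobject, real_world_vertices) one feature at a time.

    `lines` is any iterable of strings, for example an open .jsonl file.
    Assumption: the first line is the CityJSON object with `transform`.
    """
    lines = iter(lines)
    header = json.loads(next(lines))
    scale = header["transform"]["scale"]
    translate = header["transform"]["translate"]
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        feature = json.loads(line)
        # Apply the transform to this feature's own, local vertex list.
        vertices = [
            tuple(v[i] * scale[i] + translate[i] for i in range(3))
            for v in feature["vertices"]
        ]
        for co_id, cityobject in feature["CityObjects"].items():
            yield co_id, cityobject, vertices
        # Nothing is retained here, so the feature can be garbage collected.


# Made-up sample data standing in for a .jsonl file.
sample_lines = [
    json.dumps({"type": "CityJSON", "version": "1.1",
                "transform": {"scale": [0.5, 0.5, 0.5],
                              "translate": [100.0, 200.0, 0.0]},
                "CityObjects": {}, "vertices": []}),
    json.dumps({"type": "CityJSONFeature", "id": "b1",
                "CityObjects": {"b1": {"type": "Building"}},
                "vertices": [[10, 20, 30]]}),
    json.dumps({"type": "CityJSONFeature", "id": "r1",
                "CityObjects": {"r1": {"type": "Road"}},
                "vertices": [[0, 0, 0]]}),
]

sample_file = io.StringIO("\n".join(sample_lines) + "\n")
counts = Counter(co["type"] for _, co, _ in iter_cityobjects(sample_file))
print(counts)
```

Because the generator takes any iterable of lines, the same function works on an open file, so memory use stays bounded by the largest single feature rather than by the whole model.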
Execute the relevant script above for each file, with `/usr/bin/time -v python3 load_cityjson.py 68dn2_01.json` and `/usr/bin/time -v python3 load_cityjson_lines.py 68dn2_01.jsonl` respectively.
The results are summarized in the table below; the decimals are discarded, because they don't make a difference in the comparison.
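For readers without the tile at hand, the kind of memory difference being measured can be approximated with a self-contained sketch on synthetic data. It is not a reproduction of the experiment: it uses Python's `tracemalloc` to track peak Python allocations, which is not the same figure as the maximum resident set size reported by `/usr/bin/time -v`, and `make_lines`, `load_all` and `stream` are hypothetical names.

```python
import json
import tracemalloc


def make_lines(n_features, n_vertices=100):
    """Generate synthetic CityJSON Lines content: a header, then features."""
    yield json.dumps({"type": "CityJSON", "version": "1.1",
                      "transform": {"scale": [0.001] * 3,
                                    "translate": [0.0] * 3},
                      "CityObjects": {}, "vertices": []})
    for i in range(n_features):
        yield json.dumps({"type": "CityJSONFeature", "id": f"id-{i}",
                          "CityObjects": {f"id-{i}": {"type": "Building"}},
                          "vertices": [[j, j, j] for j in range(n_vertices)]})


def peak_bytes(func):
    """Return the peak number of bytes allocated by Python while func runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Built before tracing starts, so the raw text is not counted in either run.
lines = list(make_lines(200))


def load_all():
    # Mimics a whole-model load: every parsed feature is kept in memory.
    features = [json.loads(line) for line in lines[1:]]
    return sum(len(f["CityObjects"]) for f in features)


def stream():
    # Parses one feature at a time and discards it immediately.
    return sum(len(json.loads(line)["CityObjects"]) for line in lines[1:])


peak_all = peak_bytes(load_all)
peak_stream = peak_bytes(stream)
print(f"keep everything: {peak_all} bytes, streaming: {peak_stream} bytes")
```

The absolute numbers depend on the Python version and the synthetic feature size, so only the relative difference between the two runs is meaningful.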
The results indicate that CityJSON Lines files bring a very significant benefit over regular CityJSON files, at least for the operations outlined above.
Proposal
`transform` object first, before processing the features, then the whole efficient streaming logic breaks, because we need to keep everything in memory.
Disadvantages