Replies: 2 comments
-
Your intuition about what should be slow and what shouldn't is off here. With that much memory, accessing the flat nodes file is blazingly fast, because the OS will cache it completely in RAM. The non-slim data structures, on the other hand, are optimized to use less memory, basically storing the data in some kind of compressed format. Accessing that takes more time than doing the lookup in the flat nodes file. So this is all as expected.
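To make the trade-off concrete, here is a minimal Python sketch. It is not osm2pgsql's actual code: the class names, the block size of 32, and the delta-plus-varint scheme are assumptions chosen only to illustrate "some kind of compressed format". A flat array answers a lookup with one direct read, while a compressed store has to decode its way through a block before it reaches the requested value.

```python
BLOCK = 32  # entries per block; an assumed value for illustration only


def zigzag(n: int) -> int:
    """Map a signed integer to a non-negative one so small deltas stay small."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def unzigzag(z: int) -> int:
    """Inverse of zigzag()."""
    return (z >> 1) ^ -(z & 1)


def encode_varint(n: int, out: bytearray) -> None:
    """Append a non-negative integer using 7 bits per byte."""
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)


def decode_varint(buf: bytearray, pos: int) -> tuple[int, int]:
    """Read one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


class FlatStore:
    """Dense array indexed by id: one direct read per lookup."""

    def __init__(self, values):
        self.values = list(values)

    def get(self, idx: int) -> int:
        return self.values[idx]


class DeltaVarintStore:
    """Hypothetical compressed store: per-block delta encoding with varints."""

    def __init__(self, values):
        self.data = bytearray()
        self.block_offsets = []
        prev = 0
        for i, value in enumerate(values):
            if i % BLOCK == 0:
                # Each block starts from zero so it can be decoded on its own.
                self.block_offsets.append(len(self.data))
                prev = 0
            encode_varint(zigzag(value - prev), self.data)
            prev = value

    def get(self, idx: int) -> int:
        # Decode every entry from the start of the block up to idx.
        pos = self.block_offsets[idx // BLOCK]
        value = 0
        for _ in range(idx % BLOCK + 1):
            delta, pos = decode_varint(self.data, pos)
            value += unzigzag(delta)
        return value


values = [500_000_000 + 37 * i for i in range(1000)]
flat = FlatStore(values)
packed = DeltaVarintStore(values)
print(len(packed.data), "bytes packed vs", 8 * len(values), "bytes at 8 bytes per value")
```

Under these assumptions the packed store needs far fewer bytes, but every `get` costs up to `BLOCK` varint decodes instead of a single indexed read. That is the kind of extra per-lookup work that can outweigh the advantage of "being in RAM" once the flat file is fully cached anyway.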
-
Ok, thanks for the explanation, especially regarding the compressed storage format of the middle data in non-slim imports. The funny thing is (and this is of course coincidence) that with my current hardware configuration, after commit #2007, which seems to have made the import of relations 25% faster in --slim mode, and with Planet's current ways:relations ratio of approximately 100:1, I now end up with nearly equivalent import times for slim + flat-nodes on the one hand and non-slim on the other. The slightly slower import of ways in non-slim mode is compensated for by much faster relation processing and slightly faster node import. Both import modes now take approximately 8 hours for a full Planet import using a custom openstreetmap-carto derived style.
-
What version of osm2pgsql are you using?
What operating system and PostgreSQL/PostGIS version are you using?
Postgres version: 15.3 (Ubuntu 15.3-1.pgdg22.04+1)
PostGIS version: POSTGIS="3.3.3 2355e8e" [EXTENSION] PGSQL="150" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1" LIBXML="2.9.13" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)"
Tell us something about your system
Bare metal 512 GB RAM, 2x Intel Xeon E5-2699 v4
What did you do exactly?
Over the past couple of days, after upgrading to the latest master, I have been running multiple Planet-size import tests in both --slim and non-slim mode, on a machine with enough RAM to hold all data in RAM if choosing to import in non-slim mode with --flat-nodes specified.
One thing I have noticed is that way loading in non-slim mode is consistently about 15% slower than with --slim and --flat-nodes: about 70k/s for non-slim versus 85k/s for --slim and --flat-nodes.
I didn't expect this, as all data is stored in RAM in non-slim mode. Both nodes and especially relations do load (much) faster (for relations I see roughly a 2.5x speed increase). Admittedly, the --flat-nodes file is stored on a very capable NVMe RAID 0, but even so, I would assume data in RAM to be faster to access.
Is there any plausible explanation for this difference and the slower loading of ways in non-slim mode?
What did you expect to happen?
Loading ways using non-slim mode is as fast or faster than with --slim.
What did happen instead?
Loading ways using non-slim mode is about 15% slower than with --slim.
What did you do to try analyzing the problem?
Ran multiple import sessions in both --slim and non-slim mode to verify that the observed speed difference was consistent. It was.