Releases: osm2pgsql-dev/osm2pgsql
Release 2.0.0
The day has finally arrived where we are releasing version 2.0.0 of osm2pgsql, five years after the release of 1.0.0.
This release marks a milestone in modernizing osm2pgsql. We removed a lot of accumulated cruft from more than a decade of development. This makes osm2pgsql easier to understand for users and for developers. And it has allowed us to solve some long-standing issues and will allow further improvements in the future.
Major breaking changes:
- The legacy format for the middle tables has been removed as well as the old non-bucket way node index.
- The gazetteer output (used by Nominatim) has been removed.
- Several command line options have been removed and others are checked more strictly.
- There are some new library requirements and Lua is not optional any more.
- The add_row() function in the flex Lua config has been removed, use insert() instead.
- Check that Lua functions on OSM objects are called correctly using the colon-syntax.
- Handling of untagged objects and object attributes has changed.
Please see the Upgrading appendix for the details of these changes. We suggest you update to version 1.11.0 first and resolve any issues you see there before upgrading to version 2.0.0.
This is the first release that deprecates the "pgsql" output, please start moving towards the "flex" output instead. See this FAQ entry for the details.
New features in the flex output:
- Two-stage processing now also supports node members of relations, not only way members
- Optionally build id index as unique index
- Allow setting the names of indexes in the database
- New after_nodes|ways|relations() processing functions in flex Lua config files
- Make osm2pgsql properties available in Lua in osm2pgsql.properties table
- Add get_bbox() function to geometries in Lua
- Allow empty config file in flex output, useful for some corner uses
Changes in the generalization code:
- Add tile-based generalizer calling SQL commands
- Fix: Do not run ANALYZE in append mode, autovacuum will do that for us
- Fix: Handle errors in threads correctly, stopping the program with an error message
Other fixes and features:
- Make --flat-nodes also work in non-slim mode, useful if memory is tight
- Fix off-by-one error in expire code generating out of bounds tiles
- Property changes are stored later to database to avoid changing the database if anything breaks
- Report (up to 100) missing nodes in the input file (in debug log)
- Simplified code for area assembly from multipolygon relations
- Replication: guess state from file when state info is not available
- Flush and close COPYs after nodes, ways, and relations in flex output to avoid COPYs that are open for a long time
- Remove special case for old PostGIS versions when clustering
- Avoid looking for parents of new nodes and ways in the database middle, speeding up changes
- As always: Lots of code cleanups, refactorings and small fixes
Release 1.11.0
This release makes the new middle database format the default. If you have not switched already, you need to reimport your database to take advantage of that.
We have changed the way we are parsing the command line options. The new code uses the CLI11 library (a copy of which is included in the repository) and is much cleaner and also much stricter. You now get warnings (and sometimes errors) for many combinations of options that don't make sense. Please check the output from osm2pgsql and osm2pgsql-replication for such messages and fix your command lines accordingly. Note especially that duplicated options are not allowed any more. This can happen, for instance, when using osm2pgsql-replication which adds the database connection parameters (such as -d) when it calls osm2pgsql.
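To spot duplicated options before they reach the stricter parser, a wrapper script could scan the argument list first. The following is a minimal sketch, not part of osm2pgsql: the helper name find_duplicate_options is hypothetical, it does not treat a short option and its long form as the same option, and it would misread a negative number passed as a value as an option.

```python
from collections import Counter


def find_duplicate_options(argv: list[str]) -> list[str]:
    """Return option names that occur more than once, in first-seen order.

    Hypothetical helper: only looks at tokens starting with '-', strips
    any '=value' suffix, and stops at a bare '--'.
    """
    names = []
    for arg in argv:
        if arg == "--":
            break  # everything after '--' is positional
        if arg.startswith("-") and len(arg) > 1:
            names.append(arg.split("=", 1)[0])
    counts = Counter(names)  # keeps first-seen order of the names
    return [name for name in counts if counts[name] > 1]


if __name__ == "__main__":
    # Example command line where -d was passed twice.
    command = ["osm2pgsql", "-d", "osm", "--slim", "-d", "osm", "data.osm.pbf"]
    print(find_duplicate_options(command))
```

If the returned list is not empty, the command line would need to be cleaned up before calling osm2pgsql.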
If all goes well this will be the last release starting with a 1. We are planning for a version 2.0.0 in the second quarter of 2024. In that release we will remove all the functionality that has been deprecated. We will also remove support for the legacy database middle format and only support the new format introduced in version 1.9.0.
Further changes:
- The number of database connections that osm2pgsql was opening could be quite large as it was depending on the number of tables. This is no longer the case. Osm2pgsql is opening far fewer connections now; usually you will not need to change the PostgreSQL max_connections setting any more.
- Osm2pgsql now adds the context (the part of osm2pgsql responsible for a database connection) and the connection number to the application name used in the database connection. This allows you to better monitor what osm2pgsql is doing using the pg_stat_activity view in the database.
- Bugfix: Using the new database format with -x, --extra-attributes did not work due to a wrong SQL command. This is fixed now.
Many thanks to Thunderforest who supported development of the features in this release.
Release 1.10.0
This is a relatively small but still important release.
The new middle table format has changed slightly: the tags field can now be NULL. This makes storage more efficient and indexing faster. The new middle format is now declared stable and production ready. To use it, use the command line option --middle-database-format=new; in a future version of osm2pgsql this will become the new default. If you have used this option already with one of the 1.9.x versions of osm2pgsql you have to reload your database or use this SQL command to update each table: ALTER TABLE <name> ALTER COLUMN tags DROP NOT NULL; where <name> is planet_osm_nodes, planet_osm_ways, and planet_osm_rels, or the equivalents if you are using a different table name prefix.
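Since the same statement has to be run once per middle table, it can help to generate all three from the table name prefix. This is a minimal sketch; the function name drop_not_null_statements is hypothetical, and it assumes the prefix is a plain identifier that needs no quoting.

```python
def drop_not_null_statements(prefix: str = "planet_osm") -> list[str]:
    """Build the ALTER TABLE statement for each of the three middle tables.

    Assumes `prefix` is a simple identifier (no quoting is applied).
    """
    suffixes = ("nodes", "ways", "rels")
    return [
        f"ALTER TABLE {prefix}_{suffix} ALTER COLUMN tags DROP NOT NULL;"
        for suffix in suffixes
    ]


if __name__ == "__main__":
    # Print the statements so they can be reviewed and run in psql.
    for statement in drop_not_null_statements():
        print(statement)
```

The printed statements can be reviewed and then run against the database with any SQL client.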
Other changes:
- Emit a warning that the flex output area type and the add_row() functions are deprecated if you use them. If you get this warning, read https://osm2pgsql.org/doc/tutorials/switching-from-add-row-to-insert/ .
- Add first/last timestamps to expire tables. Having these timestamps allows various expire/updating strategies.
- The docs directory is now called man, because it only contains the man pages. All other docs are on the project web site.
- Various improvements on the (still experimental) generalization code. The biggest change is that we switched from using the CImg to the OpenCV library which makes the code an order of magnitude faster.
Release 1.9.2
This release fixes a bug introduced in 1.9.0 with two-stage processing that will lead to crashes. If you are using any 1.9.x version, please upgrade to 1.9.2.
In one case we had some performance problems updating an osm2pgsql database with 1.9.1 due to the PostgreSQL query planning choosing a bad plan. This release contains a workaround for that problem.
We also improved the (experimental) generalizer code a bit:
- More information is shown in log level 'info', including some timing information.
- The Lua config run_sql() command now can have either a single SQL statement in the sql field (as before) or a list of SQL commands.
- For convenience, the Lua config run_sql() command now has an optional transaction field which can be set to true to wrap the SQL commands in BEGIN/COMMIT.
- The new if_has_rows field on the run_sql() command can be set to a string with an SQL query. If that field is set, the SQL statement(s) in the sql field are only run if the SQL query returns at least one row.
- Some performance improvements in low-level code in the generalizer.
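The interplay of the sql, transaction, and if_has_rows fields described above can be modelled in a few lines. The sketch below is only an illustration of that behaviour, not osm2pgsql's implementation or its Lua API: the Python function run_sql is a hypothetical stand-in, sqlite3 replaces PostgreSQL so the example is self-contained, and the queue and result tables are made up.

```python
import sqlite3


def run_sql(conn, sql, transaction=False, if_has_rows=None):
    """Model of the described fields; returns True if the statements ran."""
    # 'sql' may be a single statement or a list of statements.
    statements = [sql] if isinstance(sql, str) else list(sql)

    # Skip everything if the guard query returns no rows.
    if if_has_rows is not None:
        if conn.execute(if_has_rows).fetchone() is None:
            return False

    if transaction:
        conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
    except Exception:
        if transaction:
            conn.execute("ROLLBACK")
        raise
    if transaction:
        conn.execute("COMMIT")
    return True


if __name__ == "__main__":
    # isolation_level=None puts sqlite3 in autocommit mode so that the
    # explicit BEGIN/COMMIT above control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE queue (id INTEGER)")
    conn.execute("CREATE TABLE result (id INTEGER)")
    guard = "SELECT 1 FROM queue LIMIT 1"

    # Queue is empty: the guard returns no rows, nothing is run.
    print(run_sql(conn, "INSERT INTO result VALUES (1)", if_has_rows=guard))

    # Queue has a row: both statements run inside one transaction.
    conn.execute("INSERT INTO queue VALUES (7)")
    steps = ["INSERT INTO result SELECT id FROM queue", "DELETE FROM queue"]
    print(run_sql(conn, steps, transaction=True, if_has_rows=guard))
```

The guard query makes it cheap to skip a whole batch of statements when there is nothing to process.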
Release 1.9.1
This release fixes some small issues with 1.9.0:
- Fix compatibility of osm2pgsql-replication with psycopg3
- Fix architecture-dependent double to integer conversion
- Some small code cleanups
Release 1.9.0
This release brings three new major features:
- a new osm2pgsql_properties table that saves command line options and reuses them on updates
- a new database middle that saves raw OSM data in JSONB format and is explicitly designed to be queried by the user
- the new (and still experimental) osm2pgsql-gen adds geometry generalization to osm2pgsql (thanks @joto)
Other changes include:
- cleanup of schema handling
- tile expiry output into database tables
- a new spherical_area() function for flex config files to calculate the area of a (multi)polygon on the sphere.
- when using the new database middle, the --middle-with-nodes option allows you to store all tagged nodes in the database (with their tags and location).
- several improvements to osm2pgsql-replication to make it more flexible and better tested (thanks to @amandasaurus and @JakobMiksch)
- don't do multi-statement SQL queries to be compatible with the PgPool-II connection pooler.
Please note that this version drops support for implicit DB schemas other than public. If you rely on implicit user schemas or custom schema paths, you now must configure the schema to be used with the --schema option.
To compile osm2pgsql some new libraries are needed, please see the README.md for details.
For more information on all new features and changes read the more extensive release notes for 1.9.0.
Release 1.8.1
This release contains some fixes and minor changes.
- Fix osm2pgsql-replication script so it works correctly with PostgreSQL schemas.
- Don't process objects without tags in outputs in append mode. This should speed up updates a little bit.
- Count number of inserted rows and rows not inserted because of NOT NULL constraints for each table and log the numbers in debug mode.
- Remove some extra-verbose debug logging when using the pole_of_inaccessibility() function.
- Flush output tables generated from nodes and ways tables earlier.
Release 1.8.0
The largest change is the addition of much more flexible index support in the flex output. The table definitions now have a new (optional) field called indexes which takes a list of index definitions. If the field is not there, we fall back to what we did before and create a GIST index on the only/first geometry column of a table. But you can also define any kind of index you want: define which index method (BTREE, GIST, ...) to use on which columns, define WHERE clauses and expression indexes and much more. See the flex-config/indexes.lua Lua config for some usage examples and the manual for all the details. You can also force osm2pgsql to always build the id indexes which are normally only built in slim mode.
The gazetteer output and the command line option --with-forward-dependencies are deprecated in this release and will be removed soon. They were only needed for Nominatim which switched to using the flex output recently.
Here are the other changes:
- Fix a problem when using osm2pgsql with a projection other than WGS84 (EPSG:4326) or Web Mercator (EPSG:3857) which made the program really slow.
- New pole_of_inaccessibility() Lua function to generate reasonably good label points from polygons. (This function is currently marked as experimental, which means it can change without notice at any time.)
- Performance improvement for very small updates. Don't spin up multiple threads when there are fewer than 100 objects to process, because the extra overhead is not worth it.
- Implement and use our own JSON writer. This removes the dependency on RapidJSON which hasn't seen a new release since 2016.
- Add more checks (or do some checks earlier) to make sure your database uses UTF-8 encoding and that necessary database extensions are loaded and index methods, schemas and tablespaces you refer to in the config are actually available.
- A lot of code needed to be updated so it works correctly with any of the recent versions of the fmt library.
As always there were lots of code cleanups across the board, but especially in code accessing the database and in the C++/Lua glue code to make it more flexible and easier to use internally.
Release 1.7.2
This release has some small changes only:
- The flex output now allows tables with only the id column (or columns).
- The osm2pgsql-replication script now always expects the osm2pgsql binary to be in the same path as itself.
- Adds the flag --middle-schema=SCHEMA to the osm2pgsql-replication script which allows placing the replication status table in a schema other than PUBLIC (Thanks to @JakobMiksch).
- More tests have been converted to the new BDD format.
- Various code cleanups and refactorings especially in the expire code.
Release 1.7.1
This release fixes a few small bugs in osm2pgsql and closes some gaps in the geometry processing code released in 1.7.0. It also contains some security-related fixes as a result of the security audit.
- Added as_multipoint() function to complement as_multilinestring() and as_multipolygon().
- The functions as_multipoint(), as_multilinestring(), and as_multipolygon() will now always return single geometries if possible. Single geometries are always allowed where multi geometries are allowed, so this doesn't break anything.
- The centroid() function now works for all geometry types.
- New length() function to compute the length of a geometry in map units.
- New reverse() function to turn geometries around (can be useful for ways tagged with oneway=-1).
- The simplify() function is now available for multilinestrings, too. (Not for polygons yet.)
- All example code in the flex-config directory has been updated for the new geometry handling capabilities.
- Create nicer error messages when trying to access a missing database extension, schema, or tablespace.
- Better checking of names (of tables, schemas, etc.) used in SQL in osm2pgsql and osm2pgsql-replication to avoid potential SQL injection issues.
- Fix: Make sure relation members show up in the correct order in multi-geometries when using slim mode.
- Fix: Do not try to run ST_IsValid() on create_only columns.
- osm2pgsql-replication: The database parameter may be empty when connection parameters are supplied via environment variables.
- osm2pgsql-replication: when installed, now runs the osm2pgsql binary that was installed with it to avoid potential security issues through PATH manipulation.
- osm2pgsql-replication: Meaningful error when middle tables do not exist or the prefix is a bad one.