osmdbt and pgoutput #38
Comments
I couldn't find any real documentation on the pgoutput plugin. I am all for using something that's already there instead of our own implementation, but is it intended as something that "the public" can use, or as something internal to PostgreSQL? We don't want to switch and then have our code break because they changed their internal representation or something.
Debezium seems to be one of the more prominent external consumers interfacing directly with pgoutput. It matches our use case, except that its destination is Apache Kafka rather than text files.

The binary format itself is documented on postgresql.org, on the page "Logical Replication Message Formats".

Since pgoutput typically supports multiple versions of its binary protocol, clients can explicitly request one particular version when connecting to the database. In the case of osmdbt-pgoutput that's version 1. As long as future PostgreSQL versions still support this version, we're good.

I've noticed some minor differences across PostgreSQL versions, such as an empty BEGIN / COMMIT pair being omitted. Functionally, this has no impact. However, unit tests that rely on the number of rows might see different results here. I've already accounted for this in the test cases.

Some links on how different projects are using pgoutput:
There's probably much more out there. If I find more interesting links, I will add them to the list.
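To give a feel for how small the documented binary format is, here is a minimal sketch of decoding a single protocol version 1 Begin message. The layout (a 'B' tag byte, then a 64-bit final LSN, a 64-bit commit timestamp and a 32-bit transaction ID, all big-endian) follows the "Logical Replication Message Formats" page. The names BeginMessage, read_be and parse_begin are made up for this illustration and are not taken from osmdbt-pgoutput.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Fields of a pgoutput Begin message (protocol version 1).
// Struct and function names are hypothetical, for illustration only.
struct BeginMessage {
    std::uint64_t final_lsn;      // LSN of the transaction's final record
    std::int64_t commit_time_us;  // microseconds since 2000-01-01 00:00:00
    std::uint32_t xid;            // transaction ID
};

// Read an n-byte big-endian (network byte order) unsigned integer.
inline std::uint64_t read_be(const unsigned char* p, std::size_t n) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v = (v << 8) | p[i];
    }
    return v;
}

// Returns std::nullopt unless buf holds exactly one Begin message.
inline std::optional<BeginMessage> parse_begin(const std::vector<unsigned char>& buf) {
    constexpr std::size_t expected_size = 1 + 8 + 8 + 4;  // tag + LSN + timestamp + xid
    if (buf.size() != expected_size || buf[0] != 'B') {
        return std::nullopt;
    }
    BeginMessage msg{};
    msg.final_lsn = read_be(&buf[1], 8);
    msg.commit_time_us = static_cast<std::int64_t>(read_be(&buf[9], 8));
    msg.xid = static_cast<std::uint32_t>(read_be(&buf[17], 4));
    return msg;
}
```

The other version 1 message types (Commit, Relation, Insert, Update, Delete, and so on) are described on the same page and can be decoded in the same style, each starting with its own tag byte.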
I'm moving my comment to a new issue, as requested by @joto
Since compiling and deploying a custom plugin seemed a bit cumbersome, I've been exploring the option to use pgoutput, a fast binary format-based plugin that's built into PostgreSQL.

"pgoutput is the standard logical decoding output plug-in in PostgreSQL 10+. It is maintained by the PostgreSQL community, and used by PostgreSQL itself for logical replication. This plug-in is always present so no additional libraries need to be installed." [1]
osmdbt-pgoutput interprets the raw replication event stream directly and translates it into the same text representation that the osm-logical plugin produces today. Most of what the osm-logical plugin used to do has moved to osmdbt-get-log.cpp and pgoutput.[ch]pp. All command line tools should work as before. Configuration-wise, a new database parameter, publication, was added to the osmdbt.yaml file.

In the long term, this approach could simplify our setup, or make it easier to use osmdbt in cloud environments with limited options for deploying custom plugins.
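To illustrate why a publication name is needed: pgoutput requires both a protocol version and a list of publications as options whenever changes are read from a slot. The sketch below builds a query against PostgreSQL's pg_logical_slot_peek_binary_changes function with those two options. Whether osmdbt-pgoutput reads the slot this way is an assumption, and the helper names sql_literal and peek_changes_query are hypothetical.

```cpp
#include <string>

// Quote a value as a SQL string literal by doubling single quotes.
// Sketch only: assumes standard_conforming_strings is on; production
// code would more likely pass these values as bound parameters.
inline std::string sql_literal(const std::string& value) {
    std::string out = "'";
    for (char c : value) {
        if (c == '\'') {
            out += '\'';
        }
        out += c;
    }
    out += '\'';
    return out;
}

// Build a query that peeks at pending changes in binary form.
// 'proto_version' and 'publication_names' are the pgoutput options;
// the two NULLs mean no upper limit on LSN or number of changes.
inline std::string peek_changes_query(const std::string& slot,
                                      const std::string& publication) {
    return "SELECT lsn, xid, data FROM pg_logical_slot_peek_binary_changes(" +
           sql_literal(slot) +
           ", NULL, NULL, 'proto_version', '1', 'publication_names', " +
           sql_literal(publication) + ")";
}
```

Each returned row carries one pgoutput message as a bytea value in the data column, which is what the client then has to decode into the text representation.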
Link: https://github.com/mmd-osm/osmdbt-pgoutput
Footnotes
[1] Quoting https://debezium.io/documentation/reference/stable/connectors/postgresql.html