PostgreSQL logical log stream for Apache Kafka

I wonder if this is possible, or if anyone has tried to configure Apache Kafka as a consumer of the PostgreSQL logical decoding stream. Does it even make sense?

I have a source system from which I need to build a live dashboard. For various reasons I am unable to hook into application events (it's a Java app, by the way). Instead, I am considering a kind of lambda architecture: when the dashboard is initialized, it reads from a data store populated by an ETL process, and change events are then pushed to the dashboard through Kafka.
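The lambda-architecture idea above can be sketched as follows: initialize the dashboard state from the post-ETL snapshot, then keep it current by applying change events as they arrive from Kafka. All function names and the event shape here are hypothetical, standing in for whatever the real consumer loop would deliver.

```python
def init_dashboard(snapshot):
    """Build the initial dashboard state from the stored snapshot."""
    return {row["id"]: row for row in snapshot}

def apply_change(state, event):
    """Apply a single change event (insert/update/delete) to the state."""
    if event["op"] == "delete":
        state.pop(event["id"], None)
    else:  # insert or update
        state[event["id"]] = event["row"]
    return state

# Initial load from the data store produced by ETL.
state = init_dashboard([
    {"id": 1, "status": "open"},
    {"id": 2, "status": "open"},
])

# Change events that would arrive via a Kafka consumer loop.
events = [
    {"op": "update", "id": 1, "row": {"id": 1, "status": "closed"}},
    {"op": "delete", "id": 2},
]
for e in events:
    state = apply_change(state, e)

print(state)  # {1: {'id': 1, 'status': 'closed'}}
```

The same `apply_change` function would run inside the Kafka consumer poll loop in the real system.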

Another use for the events stored in Kafka would be as a way to store the data changes themselves. This is necessary because there is no commercial CDC tool that supports PostgreSQL, and the source application updates its tables without saving any history.
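Since the source application overwrites rows in place, the ordered event log in Kafka becomes the only record of past values. A minimal sketch, assuming a hypothetical event shape, of replaying that log into a per-row version history:

```python
from collections import defaultdict

def build_history(events):
    """Replay change events into a per-row version history.

    The event shape here is hypothetical; in practice it would be
    whatever the CDC pipeline writes to the Kafka topic.
    """
    history = defaultdict(list)
    for ev in events:
        history[ev["id"]].append(ev["row"])
    return dict(history)

events = [
    {"id": 42, "row": {"id": 42, "status": "open"}},
    {"id": 42, "row": {"id": 42, "status": "closed"}},
]
history = build_history(events)
print(history[42])  # [{'id': 42, 'status': 'open'}, {'id': 42, 'status': 'closed'}]
```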



3 answers

The combination of xstevens' PostgreSQL-WAL-to-protobuf project, decoderbufs, and his pg_kafka producer could be a start.



Check out Bottled Water, which:

uses the logical decoding feature (introduced in PostgreSQL 9.4) to extract a consistent snapshot and a continuous stream of change events from a database. The data is extracted at the row level and encoded using Avro. A client program connects to your database, extracts this data, and relays it to Kafka.

They also have Docker images, so it looks like it would be easy to try.



The Debezium project provides a CDC connector for streaming data changes from PostgreSQL to Apache Kafka. It currently supports decoderbufs and wal2json as logical decoding plug-ins. The Bottled Water project Steve refers to is comparable, but is no longer actively maintained.

Disclaimer: I am the Debezium project lead.
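With the wal2json plug-in mentioned above, each committed transaction is emitted as a JSON message describing the changed rows. A sketch of turning one such message into the flat per-row events a dashboard consumer might use; the sample payload mirrors wal2json's format-version-1 output, but treat the exact field names as illustrative and check the plug-in's documentation for your version.

```python
import json

# A sample change message in the style of wal2json format version 1.
raw = json.dumps({
    "change": [{
        "kind": "update",
        "schema": "public",
        "table": "orders",
        "columnnames": ["id", "status"],
        "columnvalues": [7, "shipped"],
    }]
})

def parse_wal2json(message):
    """Yield (table, kind, row) tuples from one wal2json message."""
    for change in json.loads(message)["change"]:
        row = dict(zip(change["columnnames"], change["columnvalues"]))
        yield change["table"], change["kind"], row

rows = list(parse_wal2json(raw))
print(rows)  # [('orders', 'update', {'id': 7, 'status': 'shipped'})]
```

In practice Debezium does this decoding for you and publishes structured change events to Kafka topics, so code like this is only needed when consuming the raw plug-in output directly.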


