Kafka Streams: output to a topic first or persist directly?

Question

Many articles describe Kafka Streams implementations whose results are written to a new Kafka topic rather than saved to some kind of distributed database.

Is this simply an assumption of those use cases, namely that the embedded DB plus interactive queries are sufficient, or is there an architectural reason to output to a topic first and then consume it again in order to persist it, rather than persisting directly?

I'm not sure if this matters, but the examples I'm looking at deal with time-windowed aggregation.

persistence apache-kafka apache-kafka-streams

Asmodean, Jun 18 '17 at 19:04

1 answer

Michal Borowiecki · Accepted Answer · 2017-06-18T20:10:43+0000

If all you want is to take data from Kafka and store it in a DB, then Kafka Connect is the most natural way to go.

On the other hand, if your primary use case is aggregation, then Kafka Streams is indeed a simple and elegant way to go about it. If a Kafka Connect sink already exists for your database of choice, it is very easy to have Kafka Streams write to a topic and let the Kafka Connect sink pick the records up from there and store them in your DB. If there is no ready-made sink, you would have to write one yourself, and you don't expect it to be reused, then you can instead write it as a custom Kafka Streams processor and skip the output topic altogether.
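
To make the two options concrete, here is a minimal plain-Java sketch. It does not use the Kafka Streams or Kafka Connect APIs; `WindowSinkSketch`, `Sink`, `TopicSink` and `DirectDbSink` are hypothetical names chosen for illustration, and the in-memory list and map only stand in for an output topic and a database table. The point it tries to show is that the windowed aggregation is the same in both designs and only the destination changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the two designs; none of these types are Kafka APIs.
final class WindowSinkSketch {

    /** Hypothetical destination for finished window aggregates. */
    interface Sink {
        void write(long windowStart, long count);
    }

    /** Stands in for an output topic: an append-only log that other consumers can read later. */
    static final class TopicSink implements Sink {
        final List<String> log = new ArrayList<>();

        @Override
        public void write(long windowStart, long count) {
            log.add(windowStart + "=" + count);
        }
    }

    /** Stands in for a custom processor that upserts straight into a database table. */
    static final class DirectDbSink implements Sink {
        final Map<Long, Long> table = new TreeMap<>();

        @Override
        public void write(long windowStart, long count) {
            table.put(windowStart, count);
        }
    }

    /**
     * Counts events per fixed-size, non-overlapping time window and emits each
     * result to the sink. Assumes non-negative timestamps in milliseconds.
     */
    static void aggregate(long[] eventTimesMs, long windowSizeMs, Sink sink) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long t : eventTimesMs) {
            long windowStart = t - (t % windowSizeMs); // align to the window boundary
            counts.merge(windowStart, 1L, Long::sum);
        }
        counts.forEach(sink::write);
    }

    public static void main(String[] args) {
        long[] events = {1_000, 4_000, 61_000, 62_000, 63_000};

        TopicSink topic = new TopicSink();
        aggregate(events, 60_000, topic);
        System.out.println(topic.log);   // [0=2, 60000=3]

        DirectDbSink db = new DirectDbSink();
        aggregate(events, 60_000, db);
        System.out.println(db.table);    // {0=2, 60000=3}
    }
}
```

In the topic variant, whatever reads the log afterwards (a Kafka Connect sink, in the real setup) is decoupled from the aggregation and can be swapped or joined by other consumers. In the direct variant there is one less hop, but the persistence code lives inside the stream processing application.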

As you can see, there are different ways to go, depending on your use case and your preferences. There is no single right way, so please consider the tradeoffs involved.
