Splitting a Kafka consumer group

[Image: consumer group diagram from the Kafka wiki]

I read the Kafka wiki and have some questions about this picture.

  • In consumer group A, can C1 and C2 each receive messages from only two partitions? Or, for example, does C1 receive only P0 and C2 only P1?

  • As far as I know, one consumer group maps to one topic, so C1 and C2 must read the same topic, and therefore P0, P1, P2, P3 belong to the same topic, right?

  • If so, there is a contradiction: if my second point is correct, consumer group A and consumer group B read the same topic, which contradicts the idea that one consumer group maps to one topic.

  • How does C1 manage the messages from P0 and P1? If P0 and P1 belong to the same topic, does that mean C1 will get duplicate messages? If not, how does C1 manage different messages with only one offset?

  • This question is about the statement "All partitions contain the same topic, at least how I interpret this picture." Suppose there is one topic named "test" and one producer writes a "Hello test" message to that topic. Does that mean C1, C2, C3, C4 will all receive the same message? And, regarding point 4 of the answer, will C1 still get "Hello test" twice?

  • Can CG-A or CG-B receive messages from different topics?

  • I don't see the advantage of a consumer group. The Kafka wiki says: "Sometimes the logic to read messages from Kafka doesn't care about handling the message offsets, it just wants the data. So the High Level Consumer is provided to abstract most of the details of consuming events from Kafka." Can you give me an example of consumer groups based on this image? For example, you mentioned that CG-A could be a reporting job and CG-B could do monitoring.

  • Does that mean P0, P1, P2, P3 of one topic named "test" each hold different messages? I followed the Kafka wiki, for example:

    a. bin/kafka-server-start.sh config/server.properties

    b. bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic test    // 3 partitions

    c. bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

    d. bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

    When I type something in the producer, the consumer shows that message.

So how can these three partitions hold different messages?

  1. Finally, how can I use the command line to simulate this image? That is, create a consumer group, assign some consumers to it, then produce a message. Can I then trace the mapping between partitions and consumers, and see which partition a message was sent to?

Many thanks



2 answers


  • In the figure, both consumer group A and consumer group B read all 4 partitions. Within group A the partitions are split between the consumers: C1 → [P0, P3], C2 → [P1, P2]. Imagine that there is a problem with C1 and this consumer terminates. C2 will then take over the two remaining partitions, and the mapping becomes C2 → [P0, P1, P2, P3]. Imagine that you fix the problem, restart C1, and also add a third consumer C3 to the same group. Then you will have a mapping like C1 → [P0], C2 → [P1, P2], C3 → [P3].

  • The concept of topics is somewhat independent of partitioning, and a consumer group can consume a list of topics, but for simplicity the image is probably intended to show only one topic consumed by two independent consumer groups. We could imagine CG-A doing something lightweight that it can manage with only two instances, while CG-B does more complex processing and requires more parallelism. They can also have different time constraints, so that CG-B may be more of a real-time consumer (e.g. live monitoring), while CG-A may have less strict real-time constraints (e.g. reporting services). All partitions belong to the same topic, at least that is how I interpret this picture.

  • There is no contradiction. Kafka is a multi-subscriber messaging system. You can have as many consumer groups consuming the same topic as you like, independently of each other.

  • A single message exists in only one of the partitions, so no duplicate messages will be received. For redundancy, Kafka also has replication functionality, but this is a different concept from partitioning. Replication is not shown in the picture, but it would mean that you have something like [P0_leader, P1_follower, P2_follower, P3_leader] on server 1 and [P0_follower, P1_leader, P2_leader, P3_follower] on server 2.
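To make the points above concrete, here is a minimal sketch in plain Java (no Kafka client involved) of how one message lands in exactly one partition and is then seen by exactly one consumer per group. Everything in it is an assumption made for illustration: the class and helper names are hypothetical, group B is assumed to have four consumers, the hash is not the one Kafka's default partitioner uses, and the round-robin split is only one possible assignment strategy, so the concrete mapping can differ from the one in the figure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConsumerGroupSketch {

    // Hypothetical helper: choose the single partition that stores a keyed message.
    // Kafka's real default partitioner uses a different hash; this is illustrative only.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Hypothetical helper: spread partitions 0..n-1 over the consumers of ONE group,
    // round-robin. Each group gets its own, independent assignment.
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String c : consumers) {
            assignment.put(c, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody to assign partitions to
        }
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    // Consumers of one group that own the given partition (exactly one per group).
    static List<String> receivers(Map<String, List<Integer>> assignment, int partition) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : assignment.entrySet()) {
            if (e.getValue().contains(partition)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int partitions = 4;
        Map<String, List<Integer>> groupA = assign(partitions, List.of("C1", "C2"));
        Map<String, List<Integer>> groupB = assign(partitions, List.of("C3", "C4", "C5", "C6"));

        // "Hello test" is stored once, in one partition of the topic.
        int p = partitionFor("some-key", partitions);
        System.out.println("message stored in P" + p);
        System.out.println("CG-A assignment: " + groupA + ", receiver: " + receivers(groupA, p));
        System.out.println("CG-B assignment: " + groupB + ", receiver: " + receivers(groupB, p));
    }
}
```

In this sketch the message is delivered once to CG-A and once to CG-B, never twice to the same consumer, which matches the answer to the duplicate-message question.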





Below are some test results for group.id and consumer groups.

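import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
// set all other properties as required (e.g. bootstrap.servers and the key/value deserializers)
props.put("group.id", "ConsumerGroup1");  // consumers that share this id form one consumer group
props.put("max.poll.records", "1");       // return at most one record per poll()
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);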

      

group.id: consumers that share the same group.id load-balance the received data (if group.id is different for each consumer, each consumer will receive its own copy of the data).

If partitions = 1 and consumers = 2, only one of the two active consumers receives data; the other stays idle.

If partitions = 2 and consumers = 2, each of the two active consumers reads one partition.

If partitions = 3 and consumers = 2, both active consumers receive data: one consumer reads 2 partitions and the other reads 1 partition.

If partitions = 3 and consumers = 3, each of the three active consumers reads one partition.
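The four cases above follow from simple arithmetic: within one group, each partition is owned by exactly one consumer, and the partitions are spread as evenly as possible. A minimal sketch of that arithmetic in plain Java follows. The class and method names are hypothetical, and it only models the partition counts, not which consumer Kafka actually picks or how much data each partition carries.

```java
import java.util.Arrays;

public class PartitionShareSketch {

    // Hypothetical helper: how many partitions each consumer of ONE group owns
    // when the partitions are spread as evenly as possible.
    // A count of 0 means that consumer is idle.
    static int[] partitionsPerConsumer(int numPartitions, int numConsumers) {
        int[] counts = new int[numConsumers];
        int base = numPartitions / numConsumers;   // every consumer gets at least this many
        int extra = numPartitions % numConsumers;  // the first `extra` consumers get one more
        for (int i = 0; i < numConsumers; i++) {
            counts[i] = base + (i < extra ? 1 : 0);
        }
        return counts;
    }

    public static void main(String[] args) {
        // {partitions, consumers} pairs taken from the test results above
        int[][] scenarios = { {1, 2}, {2, 2}, {3, 2}, {3, 3} };
        for (int[] s : scenarios) {
            System.out.println("partitions=" + s[0] + ", consumers=" + s[1]
                    + " -> partitions per consumer: "
                    + Arrays.toString(partitionsPerConsumer(s[0], s[1])));
        }
    }
}
```

For the four scenarios this gives [1, 0], [1, 1], [2, 1] and [1, 1, 1], which lines up with the observations: adding more consumers than partitions to a group does not add parallelism.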
