What happens if a Kafka Consumer instance dies?

Posted 2019-09-14 17:24

  • The Kafka Broker has 3 partitions.
  • There are 3 Kafka Consumer instances.
  • Suddenly, one Consumer instance dies.

I know that if a Kafka Consumer instance dies, the Kafka Broker rebalances and another consumer instance is assigned that partition.

I wonder whether it is correct to assume that the other instance first consumes everything in the partition it was originally assigned, and only then is assigned the dead consumer's partition and starts consuming it.

(And do I have to implement ConsumerRebalanceListener in client code?)

If this is the case, can there be any delay in consuming the message?

Thank you.

2 answers
家丑人穷心不美
#2 · 2019-09-14 18:04

The default partition assignment strategy is RangeAssignor. For each subscribed topic, this strategy:

  • Sorts partitions into numeric order.
  • Sorts consumers into lexicographic order.
  • Tries to assign an equal number of partitions to each consumer. If the number of consumers does not divide evenly into the number of partitions, then the first few consumers will have an extra partition.

At the beginning of your example, there are

  • partitions 0, 1, 2
  • consumers A, B, C

This strategy assigned:

  • consumer A: partition 0
  • consumer B: partition 1
  • consumer C: partition 2

Suppose consumer C dies. Rebalancing executes this strategy on

  • partitions 0, 1, 2
  • consumers A, B

The strategy assigns:

  • consumer A: partition 0, 1
  • consumer B: partition 2

So in this scenario, the set of partitions assigned to consumer B after rebalancing does not contain the partition assigned to it before rebalancing.
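The walk-through above can be reproduced with a small sketch. This is not Kafka's RangeAssignor itself, only a simplified stand-in for a single topic that follows the three steps listed earlier; the class name `RangeAssignmentSketch` and the method `assign` are made up for illustration, and consumers are identified by plain strings rather than real member IDs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of Kafka: mimics range assignment for ONE topic.
final class RangeAssignmentSketch {

    // Partitions are numbered 0..partitionCount-1, so they are already in numeric order.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        // Sort consumers lexicographically (TreeSet also drops duplicates).
        List<String> sorted = new ArrayList<>(new TreeSet<>(consumers));
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (sorted.isEmpty()) {
            return result;
        }
        int perConsumer = partitionCount / sorted.size();
        int extra = partitionCount % sorted.size(); // the first `extra` consumers get one more
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = perConsumer + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                owned.add(next++); // hand out a contiguous range of partitions
            }
            result.put(sorted.get(i), owned);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(3, List.of("A", "B", "C"))); // {A=[0], B=[1], C=[2]}
        System.out.println(assign(3, List.of("A", "B")));      // {A=[0, 1], B=[2]}
    }
}
```

The second call shows the effect described above: with C gone, partition 1 moves from B to A, and B picks up partition 2.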

姐就是有狂的资本
#3 · 2019-09-14 18:05

If I am not mistaken, rebalancing will interrupt processing on all of your consumers.

If you are committing your offsets at the end of each batch, the records in the current batch that had already been processed will be processed again after the rebalance.

To avoid that, you can either commit more often, for example with consumer.commitAsync(), which lets you commit offsets in the middle of processing a batch, or implement ConsumerRebalanceListener, as you inferred.

public void onPartitionsRevoked(Collection<TopicPartition> partitions)

Called before the rebalancing starts and after the consumer stopped consuming messages. This is where you want to commit offsets, so whoever gets this partition next will know where to start.

From Kafka: The Definitive Guide
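A minimal sketch of that idea, assuming the standard kafka-clients library is on the classpath. The class `CommitOnRevokeListener` and its `markProcessed` method are hypothetical names invented for this example, not part of Kafka. The listener remembers the next offset to read for each partition and commits it synchronously when the partition is revoked.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Hypothetical helper class (not part of Kafka).
final class CommitOnRevokeListener implements ConsumerRebalanceListener {
    private final Consumer<?, ?> consumer;
    // Next offset to read per partition, not yet committed.
    private final Map<TopicPartition, OffsetAndMetadata> pending = new HashMap<>();

    CommitOnRevokeListener(Consumer<?, ?> consumer) {
        this.consumer = consumer;
    }

    // Call from the poll loop after a record has been fully processed.
    void markProcessed(TopicPartition partition, long offset) {
        // The committed offset is the position of the NEXT record to read.
        pending.put(partition, new OffsetAndMetadata(offset + 1));
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
        for (TopicPartition partition : partitions) {
            OffsetAndMetadata next = pending.remove(partition);
            if (next != null) {
                toCommit.put(partition, next);
            }
        }
        if (!toCommit.isEmpty()) {
            // Synchronous, so the commit finishes before the partitions are handed over.
            consumer.commitSync(toCommit);
        }
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // Nothing to do in this sketch; the consumer resumes from the committed offsets.
    }
}
```

To use it, pass the listener as the second argument to consumer.subscribe(...) and call markProcessed(new TopicPartition(record.topic(), record.partition()), record.offset()) after handling each record. Note that this only helps consumers that are still alive during the rebalance; a consumer that has crashed cannot run the callback, so records it processed after its last commit may still be delivered again.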

And to answer your last question: yes, rebalancing implies a delay in consuming messages.
