Kafka Proper Way to Poll No Records

2019-07-29 17:57发布

问题:

for keeping my consumer alive (very long variable length processing) I'm implementing a empty poll() call in a background thread that will keep the broker from rebalancing if I spend too much time between polls(). I have set my poll-interval to be very long, but I don't want to just keep increasing it forever for longer and longer processing.

What's the proper way to poll for no records? Currently I'm calling poll(), then re-seeking back to the earliest offsets for each partition returned in the poll call() so they can be read properly by the main thread once it's done processing the previous messages.

ConsumerRecords<String, String> msgs = kafkaConsumer.poll(timeout);
Map<Integer, Long> partitionToOffsets = getEarliestPartitionOffsets(msgs); // helper method
seekToOffsets(partitionToOffsets);

回答1:

The proper way to handle long processing time (and avoiding consumer rebalance) is to use KafkaConsumer.pause() / KafkaConsumer.resume() methods. You can read more about it here:

  • KafkaConsumer JavaDoc
  • Apache Kafka JIRA