Kafka 0.11.0.0 keeps resetting offsets on restart

Posted 2019-02-28 16:54

Question:

I have a problem with Kafka 0.11.0.0.

When I create a new topic, put some data into it, and consume it with a Java consumer, the offsets for my consumer group disappear after restarting Kafka 0.11.0.0. The topic stays and has the same data in it; only the offsets get purged. This makes the consumer download all records from the topics again. What is weird is that only one topic keeps its old, correct offsets, while all other offsets get deleted, maybe because that one topic had been there for a while.

I commit all consumed records with commitSync(). The offset is then saved on the broker: I can restart my Java consumer and it resumes from the correct offset, but after restarting the entire Kafka broker the offsets for consumer groups reset to 0. I check the current commits with the kafka-consumer-groups.sh script before consuming after a restart, and it is definitely the broker that resets them.

I had no problem with this in Kafka 0.10.2.1; I experience it only in version 0.11.0.0.

My consumer has auto.offset.reset set to earliest, and auto commit is set to false because I'm committing manually. Kafka data is stored in a non-tmp directory with the necessary permissions. The rest of the broker configuration is default.
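For reference, the consumer configuration described above can be sketched as a Java Properties object. This is a minimal sketch, not the poster's actual code: the bootstrap.servers value is an assumed placeholder, and the group id is taken from the GroupCoordinator log lines quoted further down.

```java
import java.util.Properties;

// Sketch of the consumer configuration described in the question.
// bootstrap.servers is a placeholder; the group id comes from the
// GroupCoordinator log lines in the question.
public class ConsumerConfigSketch {

    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "scrapperBackup");
        // Auto commit disabled: offsets are committed manually with commitSync().
        props.put("enable.auto.commit", "false");
        // Used only when the broker has no committed offset for the group.
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("auto.offset.reset"));
    }
}
```

With this combination, a broker that loses the committed offsets makes auto.offset.reset=earliest kick in, which is exactly why the consumer re-downloads everything from the start.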

I need version 0.11.0.0 for transactions. I have no idea where the problem could be. What could cause this? Is there a new configuration parameter I missed somewhere?

@Edit The topic that survives also has problems with its offsets: they don't get entirely purged, but after a restart they aren't correct either, and the consumer re-reads around ~15% of its data after every broken restart.

@Edit2 Sometimes, but not always, my server.log is full of:

WARN Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This implies messages have arrived out of order. New: {epoch:4, offset:1669}, Current: {epoch:5, offset:1540} for Partition: __consumer_offsets-26 (kafka.server.epoch.LeaderEpochFileCache)

It seems connected to my consumer group, judging by other log lines:

[2017-08-22 08:59:30,719] INFO [GroupCoordinator 0]: Preparing to rebalance group scrapperBackup with old generation 119 (__consumer_offsets-26) (kafka.coordinator.group.GroupCoordinator)
[2017-08-22 08:59:30,720] INFO [GroupCoordinator 0]: Group scrapperBackup with generation 120 is now empty (__consumer_offsets-26) (kafka.coordinator.group.GroupCoordinator)

There are always log lines like this one on restart:

[2017-08-22 09:15:37,948] INFO Partition [__consumer_offsets,26] on broker 0: __consumer_offsets-26 starts at Leader Epoch 6 from offset 1699. Previous Leader Epoch was: 5 (kafka.cluster.Partition)

@Edit3 Creating a new directory for the Kafka/ZooKeeper data and setting everything up from scratch helped. I don't know what the problem was, but it works properly now. It seems some error had occurred in the applications' data directories.

Answer 1:

If you experience this problem, upgrade to the new Kafka version 0.11.0.1. The problem was fixed in that release.

This JIRA issue describes the bug: https://issues.apache.org/jira/browse/KAFKA-5600