Just like in the image above, if Consumer A consumes from a partition and Consumer B, which belongs to a consumer group, consumes from the same partition, how does Kafka manage the offsets in __consumer_offsets?
I want to know how Kafka writes consumer offsets in the .index, .log, and .timeindex files.
If more than one consumer is consuming the topic (consumers in a consumer group), Kafka keeps the offsets as shown below. (I just like images; they help me understand better :) )
In the __consumer_offsets- directories, there are files like the ones below. That is where the offsets are recorded.
The index file and the log file contain the following (an example):
The timeindex file is used like the index file: it helps Kafka quickly find messages on disk, by timestamp rather than by offset.
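To make the role of the index file a bit more concrete, here is a minimal sketch of a sparse-index lookup. It assumes the index holds sorted (relative offset, byte position) pairs, and the values in `INDEX` are made up for illustration. The name `find_start_position` is hypothetical, not something Kafka exposes.

```python
from bisect import bisect_right

# Hypothetical sparse index for one log segment: (relative_offset, byte_position)
# pairs sorted by offset. Not every message gets an entry. Values are invented.
INDEX = [(0, 0), (35, 4100), (71, 8210), (104, 12330)]


def find_start_position(index, target_offset):
    """Return the byte position in the .log file to start scanning from.

    Picks the last index entry whose offset is <= target_offset; the reader
    would then scan forward in the log until it reaches the exact message.
    """
    offsets = [off for off, _ in index]
    i = bisect_right(offsets, target_offset) - 1
    if i < 0:
        return 0  # target precedes the first entry: scan from the segment start
    return index[i][1]


print(find_start_position(INDEX, 80))
```

Because the index is sparse, the lookup only narrows down where to start reading; the last few messages are still found by scanning the log file itself.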
Lastly, the diagram below can give you an idea of how the offset is stored in the log file.
All images are from Google.
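If you want to know which __consumer_offsets- directory holds the commits of a particular group, the sketch below may help. As far as I know, the broker picks the partition from the absolute value of the Java hash code of the group id, modulo the number of partitions of the offsets topic (`offsets.topic.num.partitions`, 50 by default). The function names here are my own, and the hash is a Python emulation of Java's `String.hashCode()`, so treat it as an approximation to check against your cluster.

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode(): 32-bit signed, over UTF-16 code units."""
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value into Java's signed int range.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Guess the __consumer_offsets partition for a group id.

    Assumption: partition = abs(hashCode(group_id)) % num_partitions, where
    Integer.MIN_VALUE is mapped to 0 instead of overflowing.
    """
    h = java_string_hash(group_id)
    positive = 0 if h == -0x80000000 else abs(h)
    return positive % num_partitions


# "my-group" is a made-up group id for illustration.
print(offsets_partition_for("my-group"))
```

All members of the same group map to the same partition, so all of that group's commits end up in a single directory on the broker that leads that partition.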
Quote from the docs, which can be found here: https://kafka.apache.org/documentation/#impl_offsettracking
So, it is per consumer group, not per consumer. Also, this article can be helpful for you: https://medium.com/@felipedutratine/kafka-consumer-offsets-topic-3d5483cda4a6 If you read from the consumer offsets topic, you will receive records whose key has the format [groupId,topicName,partitionNumber].
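A toy model of the "per group, not per consumer" point, assuming the offsets topic behaves like a map keyed by (group, topic, partition) where the latest commit for a key wins. The group ids, topic name, and `commit` helper are all made up for illustration; this is not the real client API.

```python
# Toy stand-in for the compacted offsets topic: latest value per key wins.
offsets = {}


def commit(group_id, topic, partition, offset):
    """Record a committed offset under the (group, topic, partition) key."""
    offsets[(group_id, topic, partition)] = offset


# Two members of the hypothetical group "group-b" commit for the same partition
# (for example before and after a rebalance): same key, so the entry is overwritten.
commit("group-b", "orders", 0, 120)
commit("group-b", "orders", 0, 135)

# A consumer with a different group id gets its own, independent entry.
commit("group-a", "orders", 0, 42)

for key, value in sorted(offsets.items()):
    print(key, value)
```

So two consumers reading the same partition only share an offset if they share a group id; otherwise each group's position is tracked separately.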