I need to find out a way to ask Kafka for a list of topics. I know I can do that using the kafka-topics.sh
script included in the bin\
directory. Once I have this list, I need all the consumers per topic. I could not find a script in that directory, nor a class in the kafka-consumer-api
library that allows me to do it.
The reason behind this is that I need to figure out the difference between the topic's offset and the consumers' offsets.
Is there a way to achieve this? Or do I need to implement this functionality in each of my consumers?
Kafka stores all the information in zookeeper. You can see all the topic related information under brokers->topics. If you wish to get all the topics programmatically you can do that using Zookeeper API.
It is explained in detail in below links
Tutorialspoint, Zookeeper Programmer guide
Use kafka-consumer-groups.sh
For example
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
bin/kafka-consumer-groups.sh --describe --group mygroup --bootstrap-server localhost:9092
you can use this for 0.9.0.0. version kafka
./kafka-consumer-groups.sh --list --zookeeper hostname:potnumber
to view the groups you have created. This will display all the consumer group names.
./kafka-consumer-groups.sh --describe --zookeeper hostname:potnumber --describe --group consumer_group_name
To view the details
GROUP, TOPIC, PARTITION, CURRENT OFFSET, LOG END OFFSET, LAG, OWNER
High level consumers are registered into Zookeeper, so you can fetch a list from ZK, similarly to the way kafka-topics.sh
fetches the list of topics. I don't think there's a way to collect all consumers; any application sending in a few consume requests is actually a "consumer", and you cannot tell whether they are done already.
On the consumer side, there's a JMX metric exposed to monitor the lag. Also, there is Burrow for lag monitoring.