If you have less consumers than partitions, what h

Posted 2019-03-24 13:21

If you have fewer consumers than partitions, does that simply mean you will not consume all the messages on a given topic?

In a cloud environment, how are you supposed to keep track of how many consumers are running and how many are pointing to a given topic#partition?

What if you have multiple consumers on a given topic#partition? I guess the consumer has to somehow keep track of which messages it has already processed in case of duplicates?

2 answers
干净又极端
Reply #2 · 2019-03-24 13:41

1) No. It means one consumer will handle more than one partition. 2) Kafka never assigns the same partition to more than one consumer in the same consumer group, because that would violate the ordering guarantee within a partition. 3) You could implement ConsumerRebalanceListener in your client code; it gets called whenever partitions are assigned to or revoked from the consumer.

You might want to take a look at this article, specifically the "Assigning partitions to consumers" part. In it I have a sample where you create a topic with 3 partitions and then a consumer with a ConsumerRebalanceListener that tells you which consumer is handling which partition. You can play around with it by starting one or more consumers and seeing what happens. The sample code is on GitHub.

http://www.javaworld.com/article/3066873/big-data/big-data-messaging-with-kafka-part-2.html

等我变得足够好
Reply #3 · 2019-03-24 13:50

In fact, each consumer belongs to a consumer group. When a consumer group reads from a topic, all records of a given partition are delivered to a single consumer in the group.

If there are more partitions than consumers in a group, some consumers will consume data from more than one partition. If there are more consumers in a group than partitions, some consumers will get no data. If you add new consumer instances to the group, they will take over some partitions from the old members. If you remove a consumer from the group (or the consumer dies), its partitions will be reassigned to the other members.
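To make the three cases above concrete, here is a minimal sketch in plain Java. It is a toy model and does not use the Kafka client: the class name `GroupAssignmentSketch`, the `assign` helper, and the consumer names are all made up for illustration, and round-robin is only an assumption (the assignor Kafka is actually configured with may distribute partitions differently).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of partition assignment inside ONE consumer group.
// Not Kafka code: round-robin is an illustrative assumption.
class GroupAssignmentSketch {

    // Spread partitions 0..partitionCount-1 over the given consumers.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody in the group, nothing to assign
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Fewer consumers than partitions: every partition still has an owner.
        System.out.println(assign(3, Arrays.asList("c1", "c2")));
        // More consumers than partitions: c4 gets nothing.
        System.out.println(assign(3, Arrays.asList("c1", "c2", "c3", "c4")));
        // c2 has left the group: recomputing hands everything to c1.
        System.out.println(assign(3, Arrays.asList("c1")));
    }
}
```

With 3 partitions and 2 consumers, `c1` ends up with two partitions, so no messages are skipped. With 4 consumers, the fourth one sits idle until a rebalance gives it something.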

Now let's take a look at your questions:

If you have fewer consumers than partitions, does that simply mean you will not consume all the messages on a given topic?

NO. Some consumers in the same consumer group will consume data from more than one partition.

In a cloud environment, how are you supposed to keep track of how many consumers are running and how many are pointing to a given topic#partition?

Kafka will take care of it. If new consumers join the group, or old consumers die, Kafka will rebalance the partitions among the remaining members.

What if you have multiple consumers on a given topic#partition?

You CANNOT have multiple consumers (in the same consumer group) consuming data from a single partition. However, if there is more than one consumer group, the same partition can be consumed by one (and only one) consumer in each consumer group.
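The "one consumer per partition per group" rule can be sketched the same way. Again, this is a toy model in plain Java, not the Kafka client: `MultiGroupSketch`, `ownersOf`, and the group names `billing` and `audit` are hypothetical, and round-robin ownership is an assumption made only so the example has something to compute.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model, not Kafka code: each consumer group divides the topic's
// partitions among its own members, independently of other groups.
class MultiGroupSketch {

    // For each group, return the single member that owns the given partition.
    // Round-robin ownership is an illustrative assumption.
    static Map<String, String> ownersOf(int partition, Map<String, List<String>> groups) {
        Map<String, String> owners = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            List<String> members = group.getValue();
            if (!members.isEmpty()) {
                owners.put(group.getKey(), members.get(partition % members.size()));
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("billing", Arrays.asList("b1", "b2")); // hypothetical group
        groups.put("audit", Arrays.asList("a1"));         // hypothetical group
        // Partition 1 is read by exactly one member of each group.
        System.out.println(ownersOf(1, groups)); // prints {billing=b2, audit=a1}
    }
}
```

Each group sees every message of the partition once, through exactly one of its members, so duplicates only come into play across groups, never inside one.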
