Consumer group member has no partition

Posted 2019-08-21 03:42

Question:

I launch two consumers in the same consumer group and subscribe to 20 topics (each has only one partition).

Only one consumer is used:

kafka-consumer-groups --bootstrap-server XXXXX:9092 --group foo --describe --members --verbose

Note: This will not show information about old Zookeeper-based consumers.

CONSUMER-ID                                  HOST            CLIENT-ID       #PARTITIONS     ASSIGNMENT
rdkafka-07cbd673-6a16-4d55-9625-7f0925866540 /xxxxx rdkafka         20              arretsBus(0), capteurMeteo(0), capteurPointMesure(0), chantier(0), coworking(0), horodateur(0), incident(0), livraison(0), meteo(0), metro(0), parkrelais(0), qair(0), rhdata(0), sensUnique(0), trafic(0), tramway(0), tweets(0), voieRapide(0), zone30(0), zoneRencontre(0)
rdkafka-9a543197-6c97-4213-bd59-cb5a48e4ec15 /xxxx    rdkafka         0 

What am I doing wrong?

Answer 1:

OK, I did some reading on this behavior, and it's interesting to know why it happens. Two of Kafka's partition assignment strategies are relevant here.

  • Range: Assigns to each consumer a consecutive subset of partitions from each topic it subscribes to. So if consumers C1 and C2 are subscribed to two topics, T1 and T2, and each of the topics has three partitions, then C1 will be assigned partitions 0 and 1 from topics T1 and T2, while C2 will be assigned partition 2 from those topics. Because each topic has an odd number of partitions and the assignment is done for each topic independently, the first consumer ends up with more partitions than the second. This happens whenever Range assignment is used and the number of consumers does not evenly divide the number of partitions in each topic.

  • RoundRobin: Takes all the partitions from all subscribed topics and assigns them to consumers sequentially, one by one. If C1 and C2 described previously used RoundRobin assignment, C1 would have partitions 0 and 2 from topic T1 and partition 1 from topic T2. C2 would have partition 1 from topic T1 and partitions 0 and 2 from topic T2. In general, if all consumers are subscribed to the same topics (a very common scenario), RoundRobin assignment will end up with all consumers having the same number of partitions (or at most 1 partition difference).
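
The two strategies above can be sketched in a few lines of plain Python. This is a simplified model, not Kafka's actual assignor code: it assumes every consumer subscribes to every topic, and the function names `range_assign` and `round_robin_assign` are made up for illustration.

```python
def range_assign(consumers, topics):
    """Range: handle each topic independently, giving each consumer a
    consecutive block of that topic's partitions.
    `topics` maps topic name -> partition count (illustrative input)."""
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    for topic in sorted(topics):
        per_member, extra = divmod(topics[topic], len(members))
        start = 0
        for i, member in enumerate(members):
            # The first `extra` consumers get one additional partition.
            count = per_member + (1 if i < extra else 0)
            assignment[member] += [(topic, p) for p in range(start, start + count)]
            start += count
    return assignment


def round_robin_assign(consumers, topics):
    """RoundRobin: pool all partitions of all topics, then deal them out
    to the consumers one by one."""
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    all_partitions = [(t, p) for t in sorted(topics) for p in range(topics[t])]
    for i, tp in enumerate(all_partitions):
        assignment[members[i % len(members)]].append(tp)
    return assignment


# The T1/T2 example from the text: two topics, three partitions each.
example = {"T1": 3, "T2": 3}
range_result = range_assign(["C1", "C2"], example)
rr_result = round_robin_assign(["C1", "C2"], example)

# The situation in the question: 20 topics with one partition each.
single = {f"topic{i:02d}": 1 for i in range(20)}
range_single = range_assign(["C1", "C2"], single)
rr_single = round_robin_assign(["C1", "C2"], single)
print({m: len(p) for m, p in range_single.items()})
print({m: len(p) for m, p in rr_single.items()})
```

In this model, Range gives all 20 single-partition topics to the first consumer, because each topic's only partition always lands on the first member, while RoundRobin splits them 10 and 10.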

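The default strategy is Range, which explains the distribution you are seeing: Range works on each topic independently, so when every topic has a single partition, that partition always goes to the first consumer in the group.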

So I ran a small experiment. I created two console consumers, each listening to topics test1, test2, test3, and test4, where each topic has only one partition. As expected, consumer-1 was assigned all partitions.

Then I changed the partition assignment strategy to org.apache.kafka.clients.consumer.RoundRobinAssignor, passed it to both console consumers, and voilà, both consumers now get 2 partitions each.

UPDATE: Oops, I didn't see that this was already answered a couple of minutes ago.



Answer 2:

In Kafka, a topic partition can be consumed by at most one consumer in a consumer group, to avoid contention between consumers.



Answer 3:

In Apache Kafka, the number of partitions defines the level of parallelism available to consumers in the same consumer group: two consumers in the same group cannot read from the same partition. In your case each topic has just one partition, which is assigned to only one consumer; the other consumer sits idle, waiting for a rebalance. If the first consumer disconnects, the second one moves from idle to consuming the partition. If you expect each consumer to get 10 topics, that is not how Apache Kafka works with the default assignment. As I said, the unit of parallelism is the partition within a topic, not the topic itself.
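
The idle-then-takeover behavior described here can be illustrated with a toy model of one topic. This is only a sketch: `assign_single_topic` is a hypothetical helper, not a Kafka API, and a real rebalance is coordinated by the broker rather than computed locally like this.

```python
def assign_single_topic(members, num_partitions):
    """Give each partition of one topic to exactly one group member.
    Members beyond the partition count receive nothing and stay idle."""
    members = sorted(members)
    assignment = {m: [] for m in members}
    for p in range(num_partitions):
        assignment[members[p % len(members)]].append(p)
    return assignment


# Two consumers, one partition: only one of them can be active.
before = assign_single_topic(["C1", "C2"], 1)
print(before)

# C1 disconnects; after the rebalance the idle consumer takes over.
after = assign_single_topic(["C2"], 1)
print(after)
```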



Answer 4:

OK, I found the problem. It works with:

'partition.assignment.strategy': 'roundrobin'

CONSUMER-ID                                  HOST            CLIENT-ID       #PARTITIONS     ASSIGNMENT
rdkafka-fa7ec1ca-1c34-498b-bd22-24ad6ca99645 /XXXX  rdkafka         10              capteurPointMesure(0), meteo(0), metro(0), parkrelais(0), qair(0), sensUnique(0), tweets(0), voieRapide(0), zone30(0), zoneRencontre(0)
rdkafka-89f765b6-2014-4b8c-bef2-c6406763118b /XXXX    rdkafka         10              arretsBus(0), capteurMeteo(0), chantier(0), coworking(0), horodateur(0), incident(0), livraison(0), rhdata(0), trafic(0), tramway(0)

The range strategy works per topic; with roundrobin I get the expected result.
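
For reference, here is how that setting might sit in a full consumer configuration. This is a sketch under assumptions: the thread does not say which client library is used (only that it is librdkafka-based), so the commented-out lines assume the confluent-kafka Python package, and `build_consumer_config` and the broker address are placeholders.

```python
def build_consumer_config(bootstrap_servers, group_id):
    """Return a librdkafka-style configuration dict (hypothetical helper).
    Every consumer in the group should use the same assignment strategy."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # Deal partitions out across all subscribed topics instead of
        # assigning topic by topic as the range strategy does.
        "partition.assignment.strategy": "roundrobin",
    }


config = build_consumer_config("localhost:9092", "foo")  # placeholder address
print(config)

# Assuming the confluent-kafka package is installed, the dict would be used as:
# from confluent_kafka import Consumer
# consumer = Consumer(config)
# consumer.subscribe(["arretsBus", "meteo", "metro"])
```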