I have an AWS infrastructure of
- 1 Kafka cluster on 3 docker containers, running on ECS, using EFS as storage service (for simplicity).
- 1 Kafka Streams application cluster, on 3 containers.
There is a source topic "events" with 16 partitions, replication 2. A PAPI topology processor "stream-processor" produces output to some other topics and uses 3 state stores.
I can see via Kafka Manager that data is consumed, and output is produced to these other output topics. Apparently it works (though slow).
But looking at consumer offsets via bin/kafka-consumer-groups.sh, I can see that only one of the partitions is being consumed simultaneously over time. In separated, consecutive runs of the command, only one of the offsets is reduced.
First execution:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
events 6 - 4021552 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 13 5030392 5030541 149 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 2 7056462 7056462 0 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 8 671945 6046546 5374601 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 1 164123 3009191 2845068 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 12 1962842 11052506 9089664 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 5 - 4022059 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 0 - 4019992 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 4 - 5032053 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 11 5037439 5037584 145 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 15 1683056 5034689 3351633 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 7 164702 7052434 6887732 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 14 - 3011069 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 3 1927601 6044400 4116799 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 10 5031461 5031612 151 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 9 1686979 8052924 6365945 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
Second execution: Only the partition 8 has advanced its offset. After 1, 5 or 15 minutes, that one is the only partition consumed.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
events 6 - 4021552 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 13 5030392 5030541 149 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 2 7056462 7056462 0 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 8 686685 6046546 5359861 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 1 164123 3009191 2845068 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 12 1962842 11052506 9089664 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 5 - 4022059 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 0 - 4019992 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 4 - 5032053 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 11 5037439 5037584 145 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 15 1683056 5034689 3351633 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 7 164702 7052434 6887732 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 14 - 3011069 - stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 3 1927601 6044400 4116799 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 10 5031461 5031612 151 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
events 9 1686979 8052924 6365945 stream-processor-fa36-StreamThread-1-consumer-6cc2 /same.ip.here stream-processor-fa36-StreamThread-1-consumer
Looking at the logs, only one of the instances is printing logs at the same time. I.e. if one is working, the other two are not.
What could be the problem here?
Kafka & Kafka Streams versions 1.1.