Too many TCP connections are stuck in the CLOSE_WAIT state on a Kafka broker, causing DisconnectException in Kafka clients.
tcp6 27 0 172.31.10.143:9092 172.31.0.47:45138 ESTABLISHED -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:41612 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.0.47:45010 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:43000 CLOSE_WAIT -
tcp6 194 0 172.31.10.143:8080 172.31.20.219:45952 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.20.219:48006 CLOSE_WAIT -
tcp6 1 0 172.31.10.143:9092 172.31.0.47:44582 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:42828 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:41934 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:41758 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:41584 CLOSE_WAIT -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:41852 CLOSE_WAIT -
tcp6 1 0 172.31.10.143:9092 172.31.0.47:44342 CLOSE_WAIT -
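To put a number on the listing above, a small script can tally connection states per local port. This is only a sketch: it assumes the column layout shown here (protocol, Recv-Q, Send-Q, local address, foreign address, state), and the function name `count_states` and the `SAMPLE` text are made up for illustration.

```python
from collections import Counter


def count_states(netstat_output, local_port):
    """Count TCP states for connections whose local address ends in :local_port."""
    counts = Counter()
    suffix = ":" + str(local_port)
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local addr, foreign addr, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3].endswith(suffix):
            counts[fields[5]] += 1
    return counts


# A few lines copied from the listing above, used as sample input.
SAMPLE = """\
tcp6 27 0 172.31.10.143:9092 172.31.0.47:45138 ESTABLISHED -
tcp6 25 0 172.31.10.143:9092 172.31.46.69:41612 CLOSE_WAIT -
tcp6 194 0 172.31.10.143:8080 172.31.20.219:45952 CLOSE_WAIT -
tcp6 1 0 172.31.10.143:9092 172.31.0.47:44582 CLOSE_WAIT -
"""

if __name__ == "__main__":
    print(count_states(SAMPLE, 9092))
```

Running it periodically against real netstat output would show whether the CLOSE_WAIT count on port 9092 keeps growing.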
Error in Debezium:
connect-prod | 2019-02-14 06:28:54,885 INFO || [Consumer clientId=consumer-3, groupId=4] Error sending fetch request (sessionId=1727876188, epoch=INITIAL) to node 2: org.apache.kafka.common.errors.DisconnectException. [org.apache.kafka.clients.FetchSessionHandler]
connect-prod | 2019-02-14 06:28:55,448 INFO || [Consumer clientId=consumer-1, groupId=4] Error sending fetch request (sessionId=1379896198, epoch=INITIAL) to node 2: org.apache.kafka.common.errors.DisconnectException. [org.apache.kafka.clients.FetchSessionHandler]
What could be causing this?
It appears that this is a known issue in Kafka 2.1.0.
https://issues.apache.org/jira/browse/KAFKA-7697
I think the connections stuck in CLOSE_WAIT are a side effect of the real problem.
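For context on why CLOSE_WAIT would be a side effect: a socket enters CLOSE_WAIT when the remote peer has closed its end but the local application has not yet called close(). If a server stops servicing its sockets for any reason, those connections pile up in that state. The sketch below is a generic TCP illustration on the loopback interface, not Kafka code; the function name `peer_closed_but_local_open` is hypothetical.

```python
import socket


def peer_closed_but_local_open():
    """Show a server-side socket that stays open after the client has closed."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()
    conn.settimeout(5)

    client.close()  # the peer sends FIN

    # recv() returning b"" means the peer has closed its side. From here
    # until conn.close() is called, the kernel reports this socket as
    # CLOSE_WAIT (visible in netstat while the process holds it open).
    data = conn.recv(1024)
    still_open = conn.fileno() != -1

    conn.close()  # an application that never reaches this line leaks CLOSE_WAIT sockets
    server.close()
    return data, still_open


if __name__ == "__main__":
    print(peer_closed_but_local_open())
```

The clients see disconnects because they time out and close their side; the broker-side sockets linger because nothing on the broker gets around to closing them.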
This issue has been fixed in Kafka 2.1.1, which should be released in a few days. Looking forward to it.