The system on which my Kafka server is running has two NICs, one with a public IP (135.220.23.45) and the other with a private one (192.168.1.14). The private NIC is connected to a subnet composed of 7 machines in total (all with addresses 192.168.1.xxx). Kafka has been installed as a service using HDP and has been configured with zookeeper.connect=192.168.1.14:2181 and listeners=PLAINTEXT://192.168.1.14:6667. I have started a consumer on the system that hosts the kafka server using: [bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.14:6667 --topic test --from-beginning].
When I start producers (using [bin/kafka-console-producer.sh --broker-list 192.168.1.14:6667 --topic test]) on any of the machines on the private subnet the messages are received normally by the consumer.
I would like to start producers on public systems and receive the messages by the consumer running on the kafka server. I believed that this could be achieved by IP masquerading and by forwarding all external requests to 135.220.23.45:15501 (I have chosen 15501 to receive kafka messages) to 192.168.1.14:6667. To that extend I setup this port forwarding rule on firewalld: [port=15501:proto=tcp:toport=6670:toaddr=192.168.1.14].
However, this doesn’t seem to work since when I start a producer on an external system with [bin/kafka-console-producer.sh --broker-list 135.220.23.45:15501 --topic] the messages cannot be received by the consumer.
I have tried different kafka config settings for listeners and advertised.listeners but none of them worked. Any help will be greatly appreciated.
You need to define different endpoints for your internal and external traffic in order for this to work. As it is currently configured, when you connect to 135.220.23.45:15501 Kafka would reply with "please talk to me on 192.168.1.14:6667 which is not reachable from the outside and everything from there on out fails.
With KIP-103 Kafka was extended to cater to these scenarios by letting you define multiple endpoints.
Full disclosure, I have not yet tried this out, but something along the following lines should at least get you started down the right road.
advertised.listeners=EXTERNAL://135.220.23.45:15501,INTERNAL://192.168.1.14:6667
inter.broker.listener.name=INTERNAL
listener.security.protocol.map=EXTERNAL:PLAINTEXT,INTERNAL:PLAINTEXT
Update:
I've tested this on a cluster of three ec2 machines out of interest. I've used the following configuration:
# internal ip: 172.31.61.130
# external ip: 184.72.211.109
listeners=INTERNAL://:9092,EXTERNAL_PLAINTEXT://:9094
advertised.listeners=INTERNAL://172.31.61.130:9092,EXTERNAL_PLAINTEXT://184.72.211.109:9094
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL_PLAINTEXT:PLAINTEXT
inter.broker.listener.name=INTERNAL
And that allowed me to send messages from both an internal machine as well as my laptop at home:
# Create topic
kafka-topics --create --topic testtopic --partitions 9 --replication-factor 3 --zookeeper 127.0.0.1:2181
# Produce messages from internal machine
[ec2-user@ip-172-31-61-130 ~]$ kafka-console-producer --broker-list 127.0.0.1:9092 --topic testtopic
>internal1
>internal2
>internal3
# Produce messages from external machine
➜ bin ./kafka-console-producer --topic testtopic --broker-list 184.72.211.109:9094
external1
external2
external3
# Check topic
[ec2-user@ip-172-31-61-130 ~]$ kafka-console-consumer --bootstrap-server 172.31.52.144:9092 --topic testtopic --from-beginning
external3
internal2
external1
external2
internal3
internal1