Why does Kafka care about the hostname?

Posted 2020-05-03 10:38

Question:

I ran a test with the code below to send data to a topic. The Kafka version is

kafka_2.12-1.1.0

The code is:

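import kafka
from kafka import KafkaProducer

print(kafka.version.__version__)

producer = KafkaProducer(
    bootstrap_servers=['172.25.44.238:9092'],
    sasl_mechanism="PLAIN",  # only takes effect with a SASL security_protocol
    api_version=(0, 10),
    retries=2
)
# Without a value_serializer, the value must be bytes on Python 3.
f = producer.send("test", b"some")
f.get()  # block until the send succeeds or fails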

If I change the server config to this:

listeners=PLAINTEXT://172.25.44.238:9092

then my code can send data to my topic.

If I change the server config to this, which is the default:

listeners=PLAINTEXT://:9092

then my code fails with this error:

kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Batch for TopicPartition(topic='test', partition=0) containing 1 record(s) expired: 30 seconds have passed since batch creation plus linger time

The difference is that the second config uses the hostname by default. And yes, the machine running the producer code cannot resolve the Kafka hostname. But I did not use the hostname in the producer code either, so it should not cause the error. So why does the hostname matter?

Answer 1:

I think you're misunderstanding the concept of "bootstrapping".

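The address you provide in bootstrap_servers only establishes the initial connection. The addresses clients actually use afterwards are the ones the broker returns in its metadata, and those are defined by advertised.listeners (which falls back to listeners when it is not set).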
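To make the fallback concrete, here is a rough Python sketch of how the advertised address ends up being the hostname. It is a simplified model rather than Kafka's actual code: the function names and the `kafka-host` hostname are made up for illustration, and it only handles a single listener.

```python
import socket


def advertised_address(listeners, advertised_listeners=None,
                       broker_hostname="kafka-host"):
    """Return the (host, port) a broker would hand back to clients.

    Simplified model: advertised.listeners falls back to listeners, and an
    empty host falls back to the broker's own hostname ("kafka-host" is a
    hypothetical placeholder).
    """
    effective = advertised_listeners or listeners
    # "PLAINTEXT://172.25.44.238:9092" -> "172.25.44.238:9092"
    _, _, hostport = effective.partition("://")
    host, _, port = hostport.rpartition(":")
    return (host or broker_hostname, int(port))


def client_can_resolve(host, port):
    """True if this machine can resolve the host the broker advertises."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False


# The default config from the question: no host, so the hostname is advertised.
host, port = advertised_address("PLAINTEXT://:9092")
print(host, port)  # kafka-host 9092
```

Running `client_can_resolve(host, port)` on the producer machine with the real broker hostname should show whether the producer can reach the address it is told to use after bootstrapping.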

In my opinion, listeners should always bind to 0.0.0.0, and you then use OS-level firewall settings to restrict access. Yes, the default advertised address is the broker's hostname, and this means only clients that can resolve that hostname can communicate with the broker.
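A possible server.properties fragment along those lines, assuming the broker should be reachable at the IP from the question. As far as I know, Kafka refuses to advertise 0.0.0.0, so advertised.listeners has to be set explicitly when listeners binds to it:

```
# Bind on all interfaces
listeners=PLAINTEXT://0.0.0.0:9092
# Address handed back to clients; it must be resolvable and routable from them
advertised.listeners=PLAINTEXT://172.25.44.238:9092
```

With this in place, the producer never needs to resolve the broker's hostname.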