Kafka topic no longer exists after restart

Posted 2019-02-28 14:34

Question:

I created a topic in my local Kafka cluster of 3 brokers by running the following from my Kafka installation directory:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 2 --topic user-activity-tracking-pipeline

Everything worked fine, as I was able to produce and consume messages on my topic. After restarting my machine, I started the bundled ZooKeeper from the Kafka installation directory by running the following in the terminal:

bin/zookeeper-server-start.sh config/zookeeper.properties

I then started the 3 brokers belonging to the cluster by running the following in the terminal from the Kafka installation directory:

env JMX_PORT=10001 bin/kafka-server-start.sh config/server1.properties
env JMX_PORT=10002 bin/kafka-server-start.sh config/server2.properties
env JMX_PORT=10003 bin/kafka-server-start.sh config/server3.properties

Now, when I list available topics by running the following in terminal from kafka installation directory,

bin/kafka-topics.sh --zookeeper localhost:2181 --list

the result is empty!

Here are the relevant configuration entries for server 1. The values for server 2 and server 3 are quite similar:

broker.id=1
listeners=PLAINTEXT://:9093
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs-broker-1
num.partitions=2
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000

I do see log files under the following directories after the restart, so nothing appears to have been cleaned up:

/tmp/kafka-logs-broker-1
/tmp/kafka-logs-broker-2
/tmp/kafka-logs-broker-3

I am wondering why the previously created topic "user-activity-tracking-pipeline" doesn't exist any more when I try to list it?

Answer 1:

kafka-topics.sh actually uses ZooKeeper data under the hood to answer the query, the rationale being that a single broker generally doesn't have enough information by itself to describe topics completely.

If you lost your ZooKeeper data during the restart (which I suspect you did, since you mention starting ZooKeeper afresh), kafka-topics.sh is now totally blind and can't see the former Kafka data.

The best way to check what's happening is to do what Kafka does when you query it. Launch your ZooKeeper client (it's as simple as running ./zkCli.sh) and type ls /brokers/topics. If the result is empty, your ZK data is lost.
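
The same check can be run without an interactive session through the zookeeper-shell.sh script bundled with Kafka, which accepts a command after the connection string. The sketch below assumes ZooKeeper listens on localhost:2181 as in the question, and that the result is printed as a bracketed list such as [topic-a, topic-b]; the function names zk_topics_from_output and list_zk_topics are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read zookeeper-shell output on stdin and print the
# topic names one per line. Assumes the result of `ls /brokers/topics` is a
# bracketed, comma-separated list ("[]" when no topics are registered).
zk_topics_from_output() {
    grep '^\[.*]$' | tail -n 1 \
        | sed -e 's/^\[//' -e 's/]$//' \
        | tr ',' '\n' \
        | sed -e 's/^ *//' -e '/^$/d'
}

# Hypothetical wrapper: run from the Kafka installation directory.
# The first argument is the ZooKeeper address (assumed default: localhost:2181).
list_zk_topics() {
    bin/zookeeper-shell.sh "${1:-localhost:2181}" ls /brokers/topics 2>/dev/null \
        | zk_topics_from_output
}
```

Calling list_zk_topics with no arguments and getting no output would point at ZooKeeper having no record of the topic, regardless of what is on the brokers' disks.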



Answer 2:

I think you've run into the problem of the /tmp directory being cleaned out whenever your computer reboots. You either need to change the directory where you store your Kafka logs, or change the TMPTIME setting in /etc/default/rcS, which controls how long files in /tmp are kept (in days).

https://askubuntu.com/questions/20783/how-is-the-tmp-directory-cleaned-up
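
To see which data directories are at risk, a small script can read the configured paths and flag any that sit under /tmp. This is a minimal sketch: the function names are hypothetical, and it only handles plain key=value lines. Note that the zookeeper.properties file bundled with Kafka typically sets dataDir to a path under /tmp as well, so it is worth checking alongside log.dirs.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY ($2) from a .properties FILE ($1).
# Commented-out lines are ignored; the last matching line wins.
prop_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Hypothetical helper: succeed if any comma-separated path in $1 is /tmp
# itself or located under /tmp.
lives_under_tmp() {
    case ",$1" in
        *,/tmp|*,/tmp/*|*,/tmp,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: warn when KEY ($2) in FILE ($1) points under /tmp.
warn_if_tmp() {
    value=$(prop_value "$1" "$2")
    if lives_under_tmp "$value"; then
        echo "WARNING: $2 in $1 is $value, which may be wiped on reboot"
    fi
}

# Example usage, run from the Kafka installation directory:
#   warn_if_tmp config/zookeeper.properties dataDir
#   warn_if_tmp config/server1.properties log.dirs
```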



Answer 3:

Having the broker directories under /tmp doesn't mean they still hold your data: brokers create these directories on startup if they don't exist.

  • Could you try to recreate the topic, restart the machine, then have a look at the /tmp directory before starting Kafka?
  • Could you try to reproduce this after changing the data directory to something other than /tmp?
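
When looking at the directories before starting Kafka, it helps to check for the topic's partition data rather than just the top-level directory. Kafka keeps each partition in a subdirectory named <topic>-<partition> under log.dirs, so a sketch like the following (the function name topic_partition_dirs is hypothetical, and the glob is deliberately loose) lists what survived the reboot:

```shell
#!/bin/sh
# Hypothetical helper: print the partition directories for TOPIC ($2)
# found directly under LOG_DIR ($1), one per line. Prints nothing if the
# directory is missing or holds no data for that topic.
topic_partition_dirs() {
    for d in "$1/$2"-[0-9]*; do
        [ -d "$d" ] && basename "$d"
    done
    return 0
}

# Example usage, run after the reboot and before starting any broker:
#   for dir in /tmp/kafka-logs-broker-1 /tmp/kafka-logs-broker-2 /tmp/kafka-logs-broker-3; do
#       echo "$dir: $(topic_partition_dirs "$dir" user-activity-tracking-pipeline | tr '\n' ' ')"
#   done
```

An empty result for every broker directory would suggest the data was wiped and the directories were simply recreated on startup.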