Kafka producer to read data files

Posted 2019-03-11 21:46

I am trying to load a data file in a loop (to check stats) instead of reading from standard input in Kafka. After downloading Kafka, I performed the following steps:

Started ZooKeeper:

bin/zookeeper-server-start.sh config/zookeeper.properties

Started the Kafka server:

bin/kafka-server-start.sh config/server.properties

Created a topic named "test":

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Ran the producer:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 
Test1
Test2

Listened with the consumer:

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Test1
Test2

Instead of standard input, I want to pass a data file to the producer so that its contents can be seen directly by the consumer. Or is there any Kafka producer other than the console producer that I can use to read data files? Any help would really be appreciated. Thanks!

5 answers
Deceive 欺骗
#2 · 2019-03-11 22:08

If there is always a single file, you can just use the tail command and pipe its output to the Kafka console producer.

But if a new file is created whenever some condition is met, you may need to use apache.commons.io.monitor (from Apache Commons IO) to detect the newly created file, then repeat the step above.

We Are One
#3 · 2019-03-11 22:16

You can read the data file with cat and pipe it to kafka-console-producer.sh:

cat ${datafile} | ${kafka_home}/bin/kafka-console-producer.sh --broker-list ${brokerlist} --topic test 
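
Since the question asks about loading the file in a loop, the pipe above can be wrapped in a small helper. This is only a sketch: replay_file is a hypothetical name, not part of Kafka, and it assumes the data file ends with a newline so that consecutive copies do not run together.

```shell
# Hypothetical helper (not part of Kafka): write a file to stdout N times
# and pipe the combined stream into whatever producer command follows.
# Usage: replay_file <file> <count> <producer command...>
replay_file() {
  replay_src=$1
  replay_count=$2
  shift 2
  replay_i=0
  while [ "$replay_i" -lt "$replay_count" ]; do
    cat "$replay_src"                 # one full pass over the data file
    replay_i=$((replay_i + 1))
  done | "$@"                         # a single producer process reads every pass
}
```

For example, replay_file "${datafile}" 10 "${kafka_home}/bin/kafka-console-producer.sh" --broker-list "${brokerlist}" --topic test would send the file ten times through one producer process, which avoids restarting the JVM on every pass.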
孤傲高冷的网名
#4 · 2019-03-11 22:21

kafka-console-producer.sh \
  --broker-list localhost:9092 \
  --topic my_topic \
  --new-producer < my_file.txt

Follow this link: http://grokbase.com/t/kafka/users/157b71babg/kafka-producer-input-file

疯言疯语
#5 · 2019-03-11 22:26

Kafka has a built-in File Stream Connector for piping the contents of a file into a topic (file source) or directing content to a file (file sink).

bin/connect-standalone.sh can read from a file. The file source is configured in config/connect-file-source.properties and the standalone worker in config/connect-standalone.properties.

So the command will be:

bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties
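
As a rough sketch of what the file source configuration might contain, the snippet below writes a minimal properties file modeled on the sample that ships with Kafka. The file name my-file-source.properties, the input path /tmp/my_file.txt, and the topic name test are assumptions for illustration; adjust them to your setup. Depending on the Kafka version, the file connector may also need to be made available through plugin.path in the worker config.

```shell
# Write a minimal file-source connector config.
# The output file name, input path and topic below are illustrative assumptions.
cat > my-file-source.properties <<'EOF'
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/my_file.txt
topic=test
EOF
```

This file would then be passed as the second argument to bin/connect-standalone.sh in place of config/connect-file-source.properties.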
地球回转人心会变
#6 · 2019-03-11 22:27

You can also try the kafkacat utility. The README on GitHub provides examples.

It would be great if you could share which tool worked best for you :)

Details from the kafkacat README:

Read messages from stdin and produce them to the 'syslog' topic with snappy compression:

$ tail -f /var/log/syslog | kafkacat -b mybroker -t syslog -z snappy