How to set up full replication in Kafka?

Posted 2020-05-08 07:12

Question:

I have two servers, a leader and a follower.

How can I make sure that when the leader fails (shuts down), all messages sent to the follower also appear on the leader once it is back up?

I know of one option: Kafka ships with a built-in synchronization tool, bin/kafka-mirror-maker.sh. It would run permanently on the leader, so that messages arriving there are also copied to the follower. When the leader shuts down, the tool would have to be started on the follower, and all messages, as I understand it, would then go to the follower. Once the leader is back up and has synchronized (that is, from the moment messages start going only to the leader again), the tool would have to be started on the leader again and stopped on the follower. That way the messages would always stay synchronized.

If the tool runs on both servers at the same time, messages are duplicated endlessly: each message keeps bouncing between the follower and the leader because of the synchronization.

But I'm not sure this approach is correct, and it requires additional resources: a service that monitors all of this and starts and stops bin/kafka-mirror-maker.sh.

How can I do this properly, without wasting resources?

Answer 1:

Kafka itself is a distributed system. Per the docs:

"Kafka replicates the log for each topic's partitions across a configurable number of servers (you can set this replication factor on a topic-by-topic basis). This allows automatic failover to these replicas when a server in the cluster fails so messages remain available in the presence of failures."
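To make the failover behaviour concrete, here is a toy model in Python. It is not Kafka code and uses no Kafka API: the Broker and Partition classes, the server names, and the messages are all made up for illustration, and it assumes one partition with two replicas of which at least one is always online. It only shows the idea that the surviving replica takes over as leader and that the restarted broker catches up from it.

```python
class Broker:
    """Hypothetical stand-in for a Kafka broker holding one partition replica."""

    def __init__(self, name):
        self.name = name
        self.log = []        # messages stored by this replica
        self.online = True


class Partition:
    """Toy model of a single partition replicated on two brokers."""

    def __init__(self, first, second):
        self.replicas = [first, second]
        self.leader = first

    def _fetch(self, follower):
        # A follower copies whatever the leader has beyond its own log end.
        missing = self.leader.log[len(follower.log):]
        follower.log.extend(missing)

    def produce(self, message):
        # Writes go to the current leader; online followers then fetch them.
        self.leader.log.append(message)
        for replica in self.replicas:
            if replica is not self.leader and replica.online:
                self._fetch(replica)

    def shutdown(self, broker):
        broker.online = False
        if broker is self.leader:
            # Leadership moves to a replica that is still online.
            self.leader = next(r for r in self.replicas if r.online)

    def restart(self, broker):
        # The returning broker rejoins as a follower and catches up.
        broker.online = True
        self._fetch(broker)


server_a = Broker("server-a")
server_b = Broker("server-b")
partition = Partition(server_a, server_b)

partition.produce("m1")
partition.shutdown(server_a)   # the original leader goes down
partition.produce("m2")        # these messages land on server-b only
partition.produce("m3")
partition.restart(server_a)    # server-a comes back and catches up

print(partition.leader.name)   # server-b
print(server_a.log)            # ['m1', 'm2', 'm3']
```

Note that in this sketch the restarted server does not automatically become leader again; it simply holds a complete copy of the log, which is what the question asks for.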

If you want to replicate between Kafka clusters (such as full datacenters, or clusters serving different purposes) then this is where something like MirrorMaker would come in.



Answer 2:

"How can I make sure that when the leader fails (shuts down), all messages sent to the follower also appear on the leader once it is back up?"

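This is built into the protocol, provided that every topic you use has a replication factor of 2 (replication-factor=2). When the failed broker restarts, it rejoins as a follower and fetches the messages it missed from the current leader.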
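As a rough sketch of how the replication factor is set at topic creation, the Python snippet below only assembles the command line for the bin/kafka-topics.sh tool that ships with Kafka; it does not contact a broker. The topic name "events" and the address "broker1:9092" are placeholders, and the helper create_topic_command is hypothetical. It assumes a Kafka release whose kafka-topics.sh accepts --bootstrap-server (older releases used --zookeeper instead).

```python
import shlex


def create_topic_command(topic, bootstrap_server, partitions=1, replication_factor=2):
    """Build the argument list for creating a replicated topic (hypothetical helper)."""
    return [
        "bin/kafka-topics.sh", "--create",
        "--bootstrap-server", bootstrap_server,
        "--topic", topic,
        "--partitions", str(partitions),
        # With two brokers, a factor of 2 puts a full copy on each of them.
        "--replication-factor", str(replication_factor),
    ]


# Placeholder topic name and broker address; replace with your own.
cmd = create_topic_command("events", "broker1:9092")
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be run from the Kafka installation directory on either broker. A topic that already exists with a replication factor of 1 is not changed by this; its partitions would have to be reassigned separately.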


It sounds like you have only two brokers on the same network, so you do not need MirrorMaker. As the docs make clear, it is meant for mirroring between separate clusters, typically in different regional datacenters.

I would add that if you did want to mirror between clusters, don't use kafka-mirror-maker. It is not as fault-tolerant or scalable as you might expect.

Instead, use MirrorMaker 2, which is built on the Kafka Connect framework.
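For reference, a minimal MirrorMaker 2 setup is driven by a properties file. The Python sketch below just generates such a file's text. The cluster aliases "dc1" and "dc2", the broker addresses, and the helper mm2_properties are assumptions for illustration; the property keys follow the pattern used in the MirrorMaker 2 documentation, but check them against the Kafka version you run.

```python
def mm2_properties(source, target, source_servers, target_servers, topics=".*"):
    """Render a one-directional MirrorMaker 2 config (hypothetical helper)."""
    lines = [
        f"clusters = {source}, {target}",
        f"{source}.bootstrap.servers = {source_servers}",
        f"{target}.bootstrap.servers = {target_servers}",
        # Enable replication from source to target only.
        f"{source}->{target}.enabled = true",
        # Regular expression selecting the topics to mirror.
        f"{source}->{target}.topics = {topics}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder aliases and addresses for two separate clusters.
config_text = mm2_properties("dc1", "dc2", "dc1-broker:9092", "dc2-broker:9092")
print(config_text)
```

Saved as, say, mm2.properties, the file would be passed to bin/connect-mirror-maker.sh. By default MirrorMaker 2 prefixes mirrored topics with the source cluster alias (for example dc1.events), which is how it avoids the endless duplication loop described in the question when mirroring runs in both directions.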