I have a RabbitMQ cluster with two nodes in production, and the cluster keeps breaking with these error messages:
=ERROR REPORT==== 23-Dec-2011::04:21:34 ===
** Node rabbit@rabbitmq02 not responding **
** Removing (timedout) connection **

=INFO REPORT==== 23-Dec-2011::04:21:35 ===
node rabbit@rabbitmq02 lost 'rabbit'

=ERROR REPORT==== 23-Dec-2011::04:21:49 ===
Mnesia(rabbit@rabbitmq01): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, rabbit@rabbitmq02}
I tried to simulate the problem by killing the connection between the two nodes using "tcpkill". The cluster disconnected, and surprisingly the two nodes did not try to reconnect!
When the cluster breaks, the HAProxy load balancer still marks both nodes as active and sends requests to both of them, even though they are no longer in a cluster.
My questions:
If the nodes are configured to work as a cluster, why don't they try to reconnect after a network failure?
How can I identify a broken cluster and shut down one of the nodes? I get consistency problems when the two nodes operate separately.
RabbitMQ also offers two ways to deal with network partitions automatically: pause-minority mode and autoheal mode. (The default behaviour is referred to as ignore mode).
In pause-minority mode RabbitMQ will automatically pause cluster nodes which determine themselves to be in a minority (i.e. fewer than or equal to half the total number of nodes) after seeing other nodes go down. It therefore chooses partition tolerance over availability from the CAP theorem. This ensures that in the event of a network partition, at most the nodes in a single partition will continue to run. (Note that in a two-node cluster each partitioned node sees exactly half the cluster, so both will pause.)
In autoheal mode RabbitMQ will automatically decide on a winning partition if a partition is deemed to have occurred. It will restart all nodes that are not in the winning partition. The winning partition is the one which has the most clients connected (or, if this produces a draw, the one with the most nodes; and if that still produces a draw then one of the partitions is chosen in an unspecified way).
You can enable either mode by setting the configuration parameter cluster_partition_handling for the rabbit application in your configuration file to either pause_minority or autoheal.

Which mode should I pick?
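For example, a minimal configuration file in the classic Erlang-term format (file name and location depend on your installation, typically rabbitmq.config) that enables pause-minority mode would look roughly like this:

    %% rabbitmq.config
    [
      {rabbit, [
        %% one of: ignore (default) | pause_minority | autoheal
        {cluster_partition_handling, pause_minority}
      ]}
    ].

The setting takes effect once the node is restarted.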
It's important to understand that allowing RabbitMQ to deal with network partitions automatically does not make them less of a problem. Network partitions will always cause problems for RabbitMQ clusters; you just get some degree of choice over what kind of problems you get. As stated in the introduction, if you want to connect RabbitMQ clusters over generally unreliable links, you should use the federation plugin or the shovel plugin.

With that said, you might wish to pick a recovery mode as follows:
ignore: Your network really is reliable. All your nodes are in a rack, connected with a switch, and that switch is also the route to the outside world. You don't want to run any risk of any of your cluster shutting down if any other part of it fails (or you have a two node cluster).
pause_minority: Your network is maybe less reliable. You have clustered across 3 AZs in EC2, and you assume that only one AZ will fail at once. In that scenario you want the remaining two AZs to continue working and the nodes from the failed AZ to rejoin automatically and without fuss when the AZ comes back.
autoheal: Your network may not be reliable. You are more concerned with continuity of service than with data integrity. You may have a two node cluster.
This answer is taken from the RabbitMQ docs; https://www.rabbitmq.com/partitions.html gives a more detailed description.
One other way to recover from this kind of failure is to work with Mnesia, the database that RabbitMQ uses as its persistence mechanism and through which the synchronization of the RabbitMQ instances (and their master/slave status) is controlled. For all the details, refer to the following URL: http://www.erlang.org/doc/apps/mnesia/Mnesia_chap7.html
Adding the relevant section here:
This is a more lengthy and involved way of recovering from such failures, but it gives better granularity and control over the data that should be available in the final master node (which can reduce the amount of data loss that might happen when "merging" RabbitMQ masters).
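As a rough sketch of the idea (my illustration under assumptions, not a procedure from the RabbitMQ docs): Mnesia lets you declare which nodes' copies of the data are authoritative via mnesia:set_master_nodes/1. Run from an Erlang shell attached to the partitioned node (for example via erl -remsh with RabbitMQ's Erlang cookie), it could look like this:

    %% Run on the node whose data you are willing to discard
    %% (here rabbit@rabbitmq02), telling its Mnesia to load tables
    %% from the node you trust on the next restart:
    mnesia:set_master_nodes(['rabbit@rabbitmq01']).
    %% Expected return value: ok.
    %% Then restart that RabbitMQ node so its tables are reloaded
    %% from rabbit@rabbitmq01.

Used carelessly this discards that node's view of the data, so it only reduces data loss relative to blindly restarting a node and accepting whichever copy happens to win.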
RabbitMQ clusters do not work well on unreliable networks (as noted in the RabbitMQ documentation). So when a network failure happens (in a two-node cluster), each node thinks that it is the master and the only node left in the cluster. Two master nodes do not automatically reconnect, because their states are not automatically synchronized (even in the case of a RabbitMQ slave, the actual message synchronization does not happen; the slave just "catches up" as messages are consumed from the queue and more messages are added).
To detect whether you have a broken cluster, run the following command (assuming a standard installation where rabbitmqctl is on the path):
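    rabbitmqctl cluster_status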
on each of the nodes that form part of the cluster. If the cluster is broken then you'll only see one node. Something like:
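(The exact output format varies between RabbitMQ versions; the point is that only the local node is listed even though the other node is still running.)

    Cluster status of node rabbit@rabbitmq1 ...
    [{nodes,[{disc,[rabbit@rabbitmq1]}]},
     {running_nodes,[rabbit@rabbitmq1]}]
    ...done.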
In such cases, you'll need to run the following set of commands on one of the nodes that formed part of the original cluster, so that it joins the other master node (say rabbit@rabbitmq1) in the cluster as a slave:
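A typical sequence, assuming rabbit@rabbitmq2 is the node being rejoined and its local data can be discarded (rabbitmqctl reset wipes that node's data, which is acceptable here since rabbit@rabbitmq1's data is the copy being kept; on older 2.x releases the join command was rabbitmqctl cluster rather than join_cluster):

    rabbitmqctl stop_app
    rabbitmqctl reset
    rabbitmqctl join_cluster rabbit@rabbitmq1
    rabbitmqctl start_app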
Finally, check the cluster status again; this time you should see both nodes.
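With the cluster healthy again, rabbitmqctl cluster_status on either node should report something roughly like:

    Cluster status of node rabbit@rabbitmq1 ...
    [{nodes,[{disc,[rabbit@rabbitmq1,rabbit@rabbitmq2]}]},
     {running_nodes,[rabbit@rabbitmq2,rabbit@rabbitmq1]}]
    ...done.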
Note: If you have the RabbitMQ nodes in an HA configuration using a Virtual IP (and the clients connect to RabbitMQ through this Virtual IP), then the node to be made the master should be the one that holds the Virtual IP.