I have installed Elasticsearch 2.2.3 and configured a cluster of 2 nodes.
Node 1 (elasticsearch.yml)
cluster.name: my-cluster
node.name: node1
bootstrap.mlockall: true
discovery.zen.ping.unicast.hosts: ["ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com", "ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
node.master: true
node.data: true
http.cors.enabled: true
script.inline: false
script.indexed: false
network.bind_host: 0.0.0.0
Node 2 (elasticsearch.yml)
cluster.name: my-cluster
node.name: node2
bootstrap.mlockall: true
discovery.zen.ping.unicast.hosts: ["ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com", "ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
node.master: false
node.data: true
http.cors.enabled: true
script.inline: false
script.indexed: false
network.bind_host: 0.0.0.0
When I run curl -XGET 'http://localhost:9200/_cluster/state?pretty'
I get:
{
"error" : {
"root_cause" : [ {
"type" : "master_not_discovered_exception",
"reason" : null
} ],
"type" : "master_not_discovered_exception",
"reason" : null
},
"status" : 503
}
The log of node 1 shows:
[2016-06-22 13:33:56,167][INFO ][cluster.service ] [node1] new_master {node1}{Vwj4gI3STr6saeTxKkSqEw}{127.0.0.1}{127.0.0.1:9300}{master=true}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-06-22 13:33:56,210][INFO ][http ] [node1] publish_address {127.0.0.1:9200}, bound_addresses {[::]:9200}
[2016-06-22 13:33:56,210][INFO ][node ] [node1] started
[2016-06-22 13:33:56,221][INFO ][gateway ] [-node1] recovered [0] indices into cluster_state
The log of node 2 shows instead:
[2016-06-22 13:34:38,419][INFO ][discovery.zen ] [node2] failed to send join request to master [{node1}{Vwj4gI3STr6saeTxKkSqEw}{127.0.0.1}{127.0.0.1:9300}{master=true}], reason [RemoteTransportException[[node2][127.0.0.1:9300][internal:discovery/zen/join]]; nested: IllegalStateException[Node [{node2}{_YUbBNx9RUuw854PKFe1CA}{127.0.0.1}{127.0.0.1:9300}{master=false}] not master for join request]; ]
Where is the error?
I resolved this by adding the following line:
network.publish_host: ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com
Every elasticsearch.yml config file must have this line, set to that node's own hostname.
On my system the firewall was on, which is why I got the same error. When I turned the firewall off, everything worked fine. So make sure that your firewall is off, or at least that it allows traffic between the nodes.
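On the network.publish_host fix: both logs in the question show each node publishing 127.0.0.1, so node2 ends up sending its join request to itself. As a rough sketch in Python, the snippet below flags that symptom in a log line. The function name publishes_loopback and the regular expression are illustrative assumptions based on the log format shown in the question; they are not part of Elasticsearch.

```python
import ipaddress
import re

# Matches the "publish_address {host:port}" fragment seen in the question's logs.
# The pattern is an assumption derived from those lines only.
PUBLISH_RE = re.compile(r"publish_address \{([^{}]+):(\d+)\}")


def publishes_loopback(log_line):
    """Return True if the line reports a loopback publish address,
    False if it reports a non-loopback one, None if none is found."""
    match = PUBLISH_RE.search(log_line)
    if match is None:
        return None
    # IPv6 literals are logged in brackets, e.g. {[::1]:9200}.
    host = match.group(1).strip("[]")
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname).
        return host == "localhost"


line = ("[2016-06-22 13:33:56,210][INFO ][http ] [node1] "
        "publish_address {127.0.0.1:9200}, bound_addresses {[::]:9200}")
print(publishes_loopback(line))
```

If this reports a loopback address on a node that other machines must reach, that node is advertising an address nobody else can use, which is what setting network.publish_host corrects.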
There are a lot of settings in here that you either don't want (like the fielddata one) or don't need. Also, you're clearly using AWS EC2 instances, so you should use the cloud-aws plugin (broken into separate plugins in ES 5.x). This will provide a new discovery model that you can take advantage of instead of zen.
You'll therefore want to install the cloud-aws plugin on each node (assuming ES 2.x, with bin/plugin install cloud-aws).
Once it is installed on each node, you can take advantage of its discovery-ec2 component by setting discovery.type to ec2 in elasticsearch.yml.
Finally, your problem is that you are failing master election for some reason that most likely stems from connectivity issues. The above configuration should fix those issues, but you have one other critical issue: you are specifying the discovery.zen.minimum_master_nodes setting incorrectly. You have two eligible master nodes, but you are asking Elasticsearch to require only one for any election. That means, in isolation, each eligible master node can decide that it has a quorum and therefore elect itself separately (thus giving two masters and effectively two clusters). This is bad.
You must therefore always set that setting using a quorum: (M / 2) + 1, rounded down, where M is the number of master eligible nodes. So with 2 master eligible nodes the value is 2. If you had 3, 4, or 5 master eligible nodes, then it would be 2, 3, or 3 respectively.
So, in your case, you should also be setting discovery.zen.minimum_master_nodes to 2.
Note, you could add this either as another line or you could modify the discovery block from above (it really comes down to style of YAML).
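As a quick check of the quorum rule in this answer, here is a minimal Python sketch that evaluates (M / 2) + 1, rounded down, for small clusters. The function name minimum_master_nodes is only chosen to mirror the setting; it is a hypothetical helper, not an Elasticsearch API.

```python
def minimum_master_nodes(master_eligible):
    """Quorum for a given number of master eligible nodes: (M / 2) + 1, rounded down."""
    if master_eligible < 1:
        raise ValueError("need at least one master eligible node")
    # Integer division does the rounding down.
    return master_eligible // 2 + 1


# Print the value to use for 1 to 5 master eligible nodes.
for m in range(1, 6):
    print(m, "master eligible ->", minimum_master_nodes(m))
```

The result is the number to put in discovery.zen.minimum_master_nodes on every master eligible node.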
The root cause of the master not discovered exception is that the nodes are not able to reach each other on port 9300. This needs to work both ways, i.e. node1 should be able to reach node2 on 9300 and vice versa.
A simple telnet is enough to confirm. From node1, run
telnet node2 9300
If it succeeds, next try this from node2:
telnet node1 9300
In case of a master not discovered exception, at least one of the above telnet commands will fail. If you don't have telnet installed, you could even use curl.
Hope this helps.
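If neither telnet nor curl is available, the same two-way check can be approximated with a few lines of Python. This is only a sketch: the helper name can_connect and the 3 second timeout are assumptions, and node1 and node2 stand in for your real hostnames.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (hostnames are placeholders):
#   on node1: print(can_connect("node2", 9300))
#   on node2: print(can_connect("node1", 9300))
```

Both directions need to return True. A False in either direction points at a firewall, a security group rule, or a node that is bound only to localhost.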
You can also get this error if the master starts with an index created by an older version of Elasticsearch while the other node starts with an empty index and initializes it with a newer version.
Sandeep's answer above hinted to me that the nodes weren't able to talk to each other. When I dug into this, I found that I was missing an inbound rule for TCP port 9300 in the EC2 security group. I added the rule, restarted the elasticsearch service on all nodes, and it started working.