OK, this shouldn't be this hard. I'm trying to run two nodes in an Elasticsearch cluster and I get an exception when starting node-1 (node-2, which is the master, is already started). Both instances run Elasticsearch v5.0.0.
Exception: failed to send join request to master, reason: RemoteTransportException, can't add node, found existing node with the same id but is a different node instance
Node-1 config:
node.name: SANNNNN-1
network.host: 10.3.185.250
discovery.zen.ping.unicast.hosts: ["10.3.185.251:9300"]
Node-2 config:
node.name: SAN-2
network.host: 10.3.185.251
discovery.zen.ping.unicast.hosts: ["10.3.185.251:9300"]
Full exception (logged by node-1, SANNNNN-1):
[INFO ][o.e.d.z.ZenDiscovery ] [SANNNNN-1] failed to send join request to master [{SAN-2}{DxExoYHHTu2-rFvuvQSuEg}{OfYBe97HQCmcha63CFiYlQ}{10.3.185.251}{10.3.185.251:9300}], reason [RemoteTransportException[[SAN-2][10.3.185.251:9300][internal:discovery/zen/join]]; nested: IllegalArgumentException[can't add node {SANNNNN-1}{DxExoYHHTu2-rFvuvQSuEg}{hP4gLRugRgWzSuNnEhGHSw}{10.3.185.250}{10.3.185.250:9300}, found existing node {SAN-2}{DxExoYHHTu2-rFvuvQSuEg}{OfYBe97HQCmcha63CFiYlQ}{10.3.185.251}{10.3.185.251:9300} with the same id but is a different node instance]; ]
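The two node descriptors in that message can be compared mechanically. Below is a minimal Python sketch, assuming each descriptor is printed as {name}{node id}{ephemeral id}{host}{address}, which is how the log line above appears to be laid out. The function names parse_nodes and duplicated_ids are made up for illustration and are not part of Elasticsearch.

```python
import re

# One node descriptor: five consecutive {...} fields, assumed to be
# {name}{node id}{ephemeral id}{host}{address} as in the log line above.
NODE_RE = re.compile(
    r"\{([^{}]*)\}\{([^{}]*)\}\{([^{}]*)\}\{([^{}]*)\}\{([^{}]*)\}"
)


def parse_nodes(log_line):
    """Return the distinct (name, node_id, address) tuples found in a log line."""
    seen = []
    for name, node_id, _ephemeral, _host, address in NODE_RE.findall(log_line):
        entry = (name, node_id, address)
        if entry not in seen:
            seen.append(entry)
    return seen


def duplicated_ids(log_line):
    """Map each node id shared by more than one node name to those names."""
    by_id = {}
    for name, node_id, _address in parse_nodes(log_line):
        by_id.setdefault(node_id, set()).add(name)
    return {nid: sorted(names) for nid, names in by_id.items() if len(names) > 1}


# Excerpt of the exception text from the question.
SAMPLE = (
    "can't add node {SANNNNN-1}{DxExoYHHTu2-rFvuvQSuEg}{hP4gLRugRgWzSuNnEhGHSw}"
    "{10.3.185.250}{10.3.185.250:9300}, found existing node "
    "{SAN-2}{DxExoYHHTu2-rFvuvQSuEg}{OfYBe97HQCmcha63CFiYlQ}"
    "{10.3.185.251}{10.3.185.251:9300} with the same id"
)

if __name__ == "__main__":
    print(duplicated_ids(SAMPLE))
```

For the message in the question, both SANNNNN-1 and SAN-2 show up under the same node id, DxExoYHHTu2-rFvuvQSuEg, even though their addresses differ.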
OK, so the issue was that I had copied the elasticsearch folder from one node to another over scp. Elasticsearch saves the node ID in the elasticsearch/data/ folder. I deleted the data folder on one node and restarted it, and the cluster is now up and running. Hope this saves someone the hassle.
Remove the directory
<Elasticsearch home>/data
and restart the ES node. This issue occurs because Elasticsearch stores the node ID in this directory, and it is a common mistake when copying a working Elasticsearch directory from one node to another. After fixing the issue, check the cluster status like this:
curl -X GET "localhost:9200/_cluster/health"
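The same check can be scripted. Here is a rough Python sketch that reads the status and number_of_nodes fields of the _cluster/health response and confirms that both nodes have joined. The function names, the default URL, and the choice to accept either green or yellow are assumptions made for this example.

```python
import json
import sys
from urllib.request import urlopen


def cluster_has_nodes(health, expected_nodes):
    """True if the health document reports the expected node count and is not red."""
    return (
        health.get("number_of_nodes") == expected_nodes
        and health.get("status") in ("green", "yellow")
    )


def fetch_health(base_url="http://localhost:9200"):
    """GET /_cluster/health from one node and return the parsed JSON body."""
    with urlopen(base_url + "/_cluster/health", timeout=10) as response:
        return json.load(response)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_health.py http://10.3.185.251:9200
    health = fetch_health(sys.argv[1])
    print(health["status"], health["number_of_nodes"])
    sys.exit(0 if cluster_has_nodes(health, expected_nodes=2) else 1)
```

For the two-node setup in the question, number_of_nodes should be 2 once node-1 has joined.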
This works fine with Elasticsearch 6 as well.