I am using Docker 17.04.0-ce and Compose 1.12.0 on Windows to deploy an Elasticsearch cluster (version 5.4.0) over Docker. So far I have done the following:
1) I have created a single Elasticsearch node via Compose with the following configuration:
elasticsearch1:
  build: elasticsearch/
  container_name: es_1
  cap_add:
    - IPC_LOCK
  environment:
    - cluster.name=cp-es-cluster
    - node.name=cloud1
    - node.master=true
    - http.cors.enabled=true
    - http.cors.allow-origin="*"
    - bootstrap.memory_lock=true
    - discovery.zen.minimum_master_nodes=1
    - xpack.security.enabled=false
    - xpack.monitoring.enabled=false
    - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
  ulimits:
    memlock:
      soft: -1
      hard: -1
    nofile:
      soft: 65536
      hard: 65536
  volumes:
    - esdata1:/usr/share/elasticsearch/data
  ports:
    - 9200:9200
    - 9300:9300
  networks:
    docker_elk:
      aliases:
        - elasticsearch
This results in the node being deployed, but it is not accessible from Spark. I write data as
JavaEsSparkSQL.saveToEs(aggregators.toDF(), collectionName +"/record");
and I get the following error, although the node is running
I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
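For context, the Spark side is configured roughly as in the sketch below; the host address, input path, and index name are placeholders rather than my exact values (the connector reads the es.* settings from the Spark configuration).

import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.elasticsearch.spark.sql.api.java.JavaEsSparkSQL;

public class EsWriteSketch {
    public static void main(String[] args) {
        // es.* settings are picked up by the connector from the Spark configuration.
        // The address below is a placeholder for the Docker host's IP.
        SparkConf conf = new SparkConf()
                .setAppName("es-write-sketch")
                .setMaster("local[*]")              // local master for testing; normally supplied by spark-submit
                .set("es.nodes", "192.168.99.100")  // placeholder: Docker host IP
                .set("es.port", "9200");

        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();

        // Placeholder input, standing in for the 'aggregators' Dataset in the question.
        Dataset<Row> df = spark.read().json("aggregators.json");

        // Same call as above: the resource string is "<index>/<type>".
        JavaEsSparkSQL.saveToEs(df, "mycollection/record");

        spark.stop();
    }
}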
2) I found out that this problem is solved if I add the following line to the node configuration:
- network.publish_host=${ENV_IP}
3) Then I created similar configurations for two additional nodes:
elasticsearch1:
  build: elasticsearch/
  container_name: es_1
  cap_add:
    - IPC_LOCK
  environment:
    - cluster.name=cp-es-cluster
    - node.name=cloud1
    - node.master=true
    - http.cors.enabled=true
    - http.cors.allow-origin="*"
    - bootstrap.memory_lock=true
    - discovery.zen.minimum_master_nodes=1
    - xpack.security.enabled=false
    - xpack.monitoring.enabled=false
    - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    - network.publish_host=${ENV_IP}
  ulimits:
    memlock:
      soft: -1
      hard: -1
    nofile:
      soft: 65536
      hard: 65536
  volumes:
    - esdata1:/usr/share/elasticsearch/data
  ports:
    - 9200:9200
    - 9300:9300
  networks:
    docker_elk:
      aliases:
        - elasticsearch

elasticsearch2:
  build: elasticsearch/
  container_name: es_2
  cap_add:
    - IPC_LOCK
  environment:
    - cluster.name=cp-es-cluster
    - node.name=cloud2
    - http.cors.enabled=true
    - http.cors.allow-origin="*"
    - bootstrap.memory_lock=true
    - discovery.zen.minimum_master_nodes=2
    - xpack.security.enabled=false
    - xpack.monitoring.enabled=false
    - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    - "discovery.zen.ping.unicast.hosts=elasticsearch1"
    - node.master=false
  ulimits:
    memlock:
      soft: -1
      hard: -1
    nofile:
      soft: 65536
      hard: 65536
  volumes:
    - esdata2:/usr/share/elasticsearch/data
  ports:
    - 9201:9200
    - 9301:9300
  networks:
    - docker_elk

elasticsearch3:
  build: elasticsearch/
  container_name: es_3
  cap_add:
    - IPC_LOCK
  environment:
    - cluster.name=cp-es-cluster
    - node.name=cloud3
    - http.cors.enabled=true
    - http.cors.allow-origin="*"
    - bootstrap.memory_lock=true
    - discovery.zen.minimum_master_nodes=2
    - xpack.security.enabled=false
    - xpack.monitoring.enabled=false
    - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    - "discovery.zen.ping.unicast.hosts=elasticsearch1"
    - node.master=false
  ulimits:
    memlock:
      soft: -1
      hard: -1
    nofile:
      soft: 65536
      hard: 65536
  volumes:
    - esdata3:/usr/share/elasticsearch/data
  ports:
    - 9202:9200
    - 9302:9300
  networks:
    - docker_elk
This results in a cluster of 3 nodes being created successfully. However, the same error reappears in Spark and data cannot be written to the cluster. I get the same behavior even if I add network.publish_host to all nodes.
Regarding Spark, I use elasticsearch-spark-20_2.11 version 5.4.0 (the same version as ES). Any ideas on how to solve this issue?
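For reference, the connector options can also be passed per write through the DataFrame writer instead of SparkConf; the sketch below uses placeholder values and is only meant to show the equivalent write path.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class EsWriterOptionsSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("es-writer-options-sketch")
                .master("local[*]")  // local master for testing; normally supplied by spark-submit
                .getOrCreate();

        // Placeholder input, standing in for the aggregated DataFrame.
        Dataset<Row> df = spark.read().json("aggregators.json");

        df.write()
          .format("org.elasticsearch.spark.sql")  // es-hadoop Spark SQL data source
          .option("es.nodes", "192.168.99.100")   // placeholder: Docker host IP
          .option("es.port", "9200")
          .mode(SaveMode.Append)
          .save("mycollection/record");           // "<index>/<type>" resource

        spark.stop();
    }
}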