Hazelcast cluster over AWS using Docker

Posted 2019-06-08 04:54

Question:

Hi, I am trying to configure a Hazelcast cluster on AWS.

I am running Hazelcast in a Docker container with --net=host so that it uses the host's network configuration.

When I look at the Hazelcast logs, I see:

[172.17.0.1]:5701 [herald] [3.8] Established socket connection between /[node2]:5701 and /[node1]:47357
04:24:22.595 [hz._hzInstance_1_herald.IO.thread-out-0] DEBUG c.h.n.t.SocketWriterInitializerImpl - [172.17.0.1]:5701 [herald] [3.8] Initializing SocketWriter WriteHandler with Cluster Protocol
04:24:22.595 [hz._hzInstance_1_herald.IO.thread-in-0] WARN  c.h.nio.tcp.TcpIpConnectionManager - [172.17.0.1]:5701 [herald] [3.8] Wrong bind request from [172.17.0.1]:5701! This node is not requested endpoint: [node2]:5701
04:24:22.595 [hz._hzInstance_1_herald.IO.thread-in-0] INFO  c.hazelcast.nio.tcp.TcpIpConnection - [172.17.0.1]:5701 [herald] [3.8] Connection[id=40, /[node2]:5701->/[node1]:47357, endpoint=null, alive=false, type=MEMBER] closed. Reason: Wrong bind request from [172.17.0.1]:5701! This node is not requested endpoint: [node2]:5701

I can see an error saying that a bind request is coming from 172.17.0.1 to node1, and node1 is not accepting this request.

        final Config config = new Config();
        config.setGroupConfig(clientConfig().getGroupConfig());
        final NetworkConfig networkConfig = new NetworkConfig();
        final JoinConfig joinConfig = new JoinConfig();
        final TcpIpConfig tcpIpConfig = new TcpIpConfig();
        final MulticastConfig multicastConfig = new MulticastConfig();
        multicastConfig.setEnabled(false);
        final AwsConfig awsConfig = new AwsConfig();
        awsConfig.setEnabled(true);
        // awsConfig.setSecurityGroupName("xxxx");
        awsConfig.setRegion("xxxx");
        awsConfig.setIamRole("xxxx");
        awsConfig.setTagKey("type");
        awsConfig.setTagValue("xxxx");
        awsConfig.setConnectionTimeoutSeconds(120);
        joinConfig.setAwsConfig(awsConfig);
        joinConfig.setMulticastConfig(multicastConfig);
        joinConfig.setTcpIpConfig(tcpIpConfig);
        networkConfig.setJoin(joinConfig);
        final InterfacesConfig interfaceConfig = networkConfig.getInterfaces();
        interfaceConfig.setEnabled(true).addInterface("172.29.238.71");
        config.setNetworkConfig(networkConfig);

Above is the code I use to configure AwsConfig. Please help me resolve this issue.

Thanks

Answer 1:

You are hitting an issue (#11795) in Hazelcast's default bind address selection mechanism.

There are several workarounds available:

Workaround 1: System property

You can set the bind address by passing the correct IP address in the hazelcast.local.localAddress system property:

java -Dhazelcast.local.localAddress=[yourCorrectIpGoesHere]

or

System.setProperty("hazelcast.local.localAddress", "[yourCorrectIpGoesHere]");

See the System Properties chapter of the Hazelcast Reference Manual for details.

Workaround 2: Hazelcast Network configuration

The Hazelcast network configuration lets you specify which IP addresses the member may bind to.

Declaratively, in hazelcast.xml:

<hazelcast>
  ...
  <network>
    ...
    <interfaces enabled="true">
      <interface>10.3.16.*</interface> 
      <interface>10.3.10.4-18</interface> 
      <interface>192.168.1.3</interface>         
    </interfaces>    
  </network>
  ...
</hazelcast>

Programmatically:

Config config = new Config();
NetworkConfig network = config.getNetworkConfig();
InterfacesConfig interfaceConfig = network.getInterfaces();
interfaceConfig.setEnabled(true).addInterface("192.168.1.3");
HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);

See the Interfaces section of the Hazelcast Reference Manual for details.
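
To make the three pattern forms used in the hazelcast.xml example concrete (a wildcard octet, an octet range, and an exact address), here is a minimal sketch of how such a pattern can be checked against an IPv4 address. It is an illustration only, not Hazelcast's own matcher; the class name InterfacePatternSketch and the method matches are hypothetical, and the sketch assumes well-formed dotted IPv4 input.

```java
final class InterfacePatternSketch {

    // Returns true if the dotted IPv4 address matches the pattern. Each pattern
    // octet is a literal number, "*" (any value), or "lo-hi" (inclusive range).
    // Assumption: both arguments are well-formed; malformed numbers will throw
    // NumberFormatException.
    static boolean matches(String pattern, String address) {
        String[] p = pattern.trim().split("\\.");
        String[] a = address.trim().split("\\.");
        if (p.length != 4 || a.length != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (p[i].equals("*")) {
                continue; // wildcard octet matches anything
            }
            int octet = Integer.parseInt(a[i]);
            int dash = p[i].indexOf('-');
            if (dash > 0) {
                // range octet such as "4-18"
                int lo = Integer.parseInt(p[i].substring(0, dash));
                int hi = Integer.parseInt(p[i].substring(dash + 1));
                if (octet < lo || octet > hi) {
                    return false;
                }
            } else if (Integer.parseInt(p[i]) != octet) {
                return false; // literal octet must be equal
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("10.3.16.*", "10.3.16.200"));   // true
        System.out.println(matches("10.3.10.4-18", "10.3.10.19")); // false
        System.out.println(matches("192.168.1.3", "172.17.0.1"));  // false
    }
}
```

The last call shows why an entry such as 192.168.1.3 keeps the member from binding to 172.17.0.1: that address matches none of the listed interfaces.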

Update: The steps above let you set a proper bind address, for instance the local one returned by ip addr show. However, that may not be enough if you run Hazelcast in an environment where the local IP and the public IP differ (clouds, Docker).
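
If it is unclear which local addresses the JVM actually sees, they can be listed with the standard java.net API, much like ip addr show. The sketch below is a hedged illustration, not part of Hazelcast: the class BindAddressCandidates and its methods are hypothetical, and the interface-name filter assumes the usual Linux naming for Docker (docker0, br-*, veth*). The 172.17.0.1 address in the question's logs is the default address of the docker0 bridge, which is why filtering it out is useful here.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BindAddressCandidates {

    // Collects the IPv4 addresses of every interface that is up and not
    // loopback, keyed by interface name.
    static Map<String, List<String>> localIpv4ByInterface() throws SocketException {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return result;
        }
        for (NetworkInterface ni : Collections.list(interfaces)) {
            if (!ni.isUp() || ni.isLoopback()) {
                continue;
            }
            List<String> addrs = new ArrayList<>();
            for (InetAddress addr : Collections.list(ni.getInetAddresses())) {
                if (addr instanceof Inet4Address) {
                    addrs.add(addr.getHostAddress());
                }
            }
            if (!addrs.isEmpty()) {
                result.put(ni.getName(), addrs);
            }
        }
        return result;
    }

    // Assumption: Docker's host-side interfaces are named docker0, br-* or veth*.
    static boolean looksLikeDockerInterface(String name) {
        return name.equals("docker0") || name.startsWith("br-") || name.startsWith("veth");
    }

    // Returns the addresses of all interfaces that do not look like Docker's.
    static List<String> withoutDockerBridges(Map<String, List<String>> byInterface) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : byInterface.entrySet()) {
            if (!looksLikeDockerInterface(e.getKey())) {
                candidates.addAll(e.getValue());
            }
        }
        return candidates;
    }

    public static void main(String[] args) throws SocketException {
        Map<String, List<String>> all = localIpv4ByInterface();
        System.out.println("All interfaces: " + all);
        System.out.println("Bind address candidates: " + withoutDockerBridges(all));
    }
}
```

One of the printed candidates can then be passed to addInterface(...) or to the hazelcast.local.localAddress property described above.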

Next Step: Configure public address

This step is necessary in environments where cluster nodes cannot reach each other at the local address that the other node reports. You have to set the public address, i.e. the address at which other nodes can reach this node (optionally with a port).

networkConfig.setPublicAddress("172.29.238.71");

// or, if a non-default Hazelcast port is used, e.g. 9991
networkConfig.setPublicAddress("172.29.238.71:9991");