I have two Virtual Machines that use internal IP addresses to speak to one another while the outside world knows about these VMs only via external IP addresses.
I have a distributed cache that makes use of the two VMs: each has an Erlang node that must communicate with the other. I also have Erlang clients of the cache, on other machines, that need to communicate with one (or both) of the Erlang caching nodes on the VMs.
So, if I name the cache nodes using the internal IP addresses, they can communicate with one another, but no other Erlang node can interact with them. But if I name the cache nodes using the VMs' external IP addresses, the outside Erlang nodes can communicate with the cache nodes, but the cache nodes cannot communicate with one another.
Is there something I can do about this other than using an http or socket-based interface that does not rely on joining the nodes into a mesh?
Before answering the question, I want to point out that Distributed Erlang is not secure and you should not use it outside of a trusted local area network. (I had to leave that comment after I saw "external IP address"; I assume it doesn't mean public.) Below is a list of 4 important things you should be aware of:
When you start a node, it is good to give it a name like this:
When you try to connect to that node from another machine, you can use something like this in the Erlang shell:
The important part is that node name:
'myname@192.168.0.1'
is an atom. It is not "name and IP"; it is one thing. Even if your second node is on the same machine, you can't use a name with a different host part, because that is a different node name.
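As a small sketch of the point above: the module below treats a node name as a single atom and compares two names. The module name node_name_demo, its functions, and the loopback name 'myname@127.0.0.1' are hypothetical and used only for illustration.

```erlang
-module(node_name_demo).
-export([parts/1, same_node/2]).

%% Split a node name atom into its name and host parts. This is for
%% display only: Erlang treats the whole atom as the node's identity.
parts(Node) when is_atom(Node) ->
    [Name, Host] = string:split(atom_to_list(Node), "@"),
    {Name, Host}.

%% Two node names refer to the same node only if the atoms are identical.
same_node(A, B) when is_atom(A), is_atom(B) ->
    A =:= B.
```

With this, node_name_demo:same_node('myname@192.168.0.1', 'myname@127.0.0.1') is false even if both addresses belong to the same machine. Assuming the node was started with something like erl -name myname@192.168.0.1, another node would have to use exactly that atom, for instance in net_kernel:connect_node('myname@192.168.0.1').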
This means that to connect two nodes, only one has to see the other.
Example: you have:
then:
In the second case, the connection is just as if they were on the same network. You only have to think about who should initialise the connection.
If you don't want that, use hidden nodes.
But I think you have that covered, if you were able to connect other nodes.
The easiest solution is to use external IP addresses everywhere, because Erlang distribution was designed to run on a local network. A harder solution involves making sure that you connect from nodes in the local network to nodes with an external IP.
What you are trying to achieve is definitely doable.
Preliminaries
Erlang's distribution addresses are in two parts: the node name and the host name. They are separated by the @ sign.
Host names can be numeric IPv4 addresses. They can also be domain names. There are two distinct modes, where host names are short (single word, e.g. vm1) and where they are long (several words, e.g. vm1.domain.com). IP addresses are long names. Nodes started in one mode (short or long) can only communicate with nodes started in the same mode. Nodes are also protected by a cookie: a node will only accept incoming connections with a matching cookie. The easiest is to start all nodes of a given cluster with the same cookie.
When an Erlang node tries to connect to another Erlang node, it needs to find the IP address of the distant node. If the host name is the same as its own, it will simply try to connect on the local host. If it is different, it will try to resolve this host name to an IP address.
Then it will connect to the epmd daemon on this host to be told which port the Erlang node is listening on. Both epmd and Erlang nodes listen on all interfaces (by default).
Solution and example
Based on this mechanism, you could use either short or long names, but exploit the resolution mechanism. The easiest on Unix would be to configure different IPs in the /etc/hosts of each of your machines (especially on the two virtual machines) so they will connect to each other through their private addresses, while being accessed through their public addresses.
Let's say that Virtual machine A (VM A) has private IP address 10.0.0.2 and public IP address 123.4.5.2 and VM B has private IP address 10.0.0.3 and public IP address 123.4.5.3. Let's also say that you decided to go for short names.
On VM A, you could put an entry in /etc/hosts that maps VM B's host name to its private address.
You could put the matching entry for VM A in VM B's /etc/hosts.
And on all the external clients, you could map both host names to the public addresses.
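With the addresses assumed above, the entries could look like the following sketch. The host names vma and vmb are assumptions; each machine's short host name has to match the host part of its node name.

```text
# /etc/hosts on VM A: reach VM B through its private address
10.0.0.3    vmb

# /etc/hosts on VM B: reach VM A through its private address
10.0.0.2    vma

# /etc/hosts on every external client: reach both VMs through their public addresses
123.4.5.2   vma
123.4.5.3   vmb
```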
You would start your nodes as follows:
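A minimal sketch, assuming short host names vma and vmb, a hypothetical node name cache, and a hypothetical shared cookie mycookie. The OS-level start commands are shown as comments; the calls are meant to be typed in the Erlang shell of the node indicated.

```erlang
%% OS shell commands (node name "cache" and cookie "mycookie" are hypothetical):
%%   on VM A:      erl -sname cache@vma -setcookie mycookie
%%   on VM B:      erl -sname cache@vmb -setcookie mycookie
%%   on a client:  erl -sname client -setcookie mycookie

%% In the Erlang shell on VM A: connect to VM B. The host name vmb is
%% resolved on VM A, so it can map to the private address there.
net_adm:ping('cache@vmb').

%% In the Erlang shell on a client: connect to VM A. The host name vma is
%% resolved on the client, so it can map to the public address there.
net_adm:ping('cache@vma').

%% List the nodes this node is currently connected to.
nodes().
```

net_adm:ping/1 returns pong when the connection succeeds and pang when it does not, which makes it a quick check that name resolution, cookie, and firewall rules all line up.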
You can avoid the /etc/hosts edits on client nodes if you have a domain name (e.g. yourdomain.com) and you can get vma.yourdomain.com to resolve to 123.4.5.2. You can also use a specific Erlang Inet configuration file.
Security
The Erlang distribution mechanism is not meant to be public facing. Besides, all communications will be unencrypted. I strongly suggest configuring firewalls on each host to only allow connections from other cluster servers, and using SSL distribution.
For the firewall: Erlang distribution uses port 4369 for epmd as well as random ports for each node. You can limit the range of these random ports by using the Erlang kernel application environment settings inet_dist_listen_min and inet_dist_listen_max. You will need to allow incoming TCP connections on these ports, but only from other hosts of the cluster.
SSL distribution is quite complex to set up but well documented. The main drawback in your case is that all connections should be over SSL, including those between the two virtual machines on their private network, and local connections to open remote shells.
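For the port range mentioned in the firewall paragraph, the kernel settings could be placed in a configuration file along these lines. This is a sketch: the file name sys.config and the range 9100-9105 are assumptions, and any range your firewall allows would do.

```erlang
%% sys.config (sketch): restrict the distribution listen ports to 9100-9105.
%% The port numbers are arbitrary example values.
[
 {kernel, [
   {inet_dist_listen_min, 9100},
   {inet_dist_listen_max, 9105}
 ]}
].
```

The node would then be started with the -config sys option added to the erl command line, and the firewall would allow incoming TCP on 4369 and 9100-9105 from the other cluster hosts only.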