Can Mesos 'master' and 'slave' nod

Posted 2020-02-16 20:34

Question:

Can Apache Mesos 'master' nodes be co-located on the same machine as Mesos 'slave' nodes? Similarly, for high-availability (HA) deployments, can the Apache ZooKeeper nodes used in Mesos 'master' election be deployed on the same machines as Mesos 'slave' nodes?

Mesos recommends 3 'masters' for HA deployments, and ZooKeeper recommends 5 nodes for its quorum election system. It would be nice to have these services running alongside Mesos 'slave' processes instead of committing 8 machines to effectively 'non-productive' tasks.

If such a setup is feasible, what are its pros and cons?

Thanks!

Answer 1:

You can definitely run a master, a slave, and a ZooKeeper (ZK) process all on the same node. You can even run multiple master and slave processes on the same node, provided you give each one a unique port, but that's only useful for a test cluster.
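As a rough illustration of the "unique ports" point, here is a minimal Python sketch that builds command lines for a single-host test cluster. The helper names, port ranges, work directories, and ZK URL are all made up for illustration. The flags shown (`--port`, `--work_dir`, `--zk`, `--quorum`, `--master`) are the commonly used ones, but check them against `--help` for your Mesos version.

```python
# Hypothetical helper for a single-host TEST cluster only.
# Ports, directories, and the ZK URL below are illustrative assumptions.

def majority(n):
    """Smallest number of members that forms a strict majority of n."""
    return n // 2 + 1


def build_test_cluster_commands(num_masters, num_slaves,
                                zk_url="zk://127.0.0.1:2181/mesos",
                                master_base_port=5050,
                                slave_base_port=5150):
    """Return one argv list per process, each with its own port and work dir."""
    commands = []
    for i in range(num_masters):
        commands.append([
            "mesos-master",
            f"--port={master_base_port + i}",
            f"--work_dir=/tmp/mesos/master{i}",
            f"--zk={zk_url}",
            # The master quorum should be a majority of the masters.
            f"--quorum={majority(num_masters)}",
        ])
    for i in range(num_slaves):
        commands.append([
            "mesos-slave",
            f"--port={slave_base_port + i}",
            f"--work_dir=/tmp/mesos/slave{i}",
            f"--master={zk_url}",
        ])
    return commands


if __name__ == "__main__":
    for argv in build_test_cluster_commands(num_masters=3, num_slaves=2):
        print(" ".join(argv))
```

The script only prints the commands, so you can review them before starting anything. With 3 masters the quorum comes out as 2.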

Typically we recommend running ZK on the same nodes as your masters, but if you have extra ZKs, you can certainly run them on slaves, or mix-and-match as you see fit, as long as all master/slave/framework nodes can reach the ZK nodes, and all slaves can reach the masters.
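Whichever layout you pick, the reachability requirement is easy to spot-check. Below is a minimal sketch that simply tries a TCP connection to each endpoint, using only the Python standard library. The hostnames in the usage comment are placeholders, and ports 2181 (ZK client) and 5050 (master) are the usual defaults, so adjust them if yours differ. A successful connect only shows the port is open, not that the service is healthy.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def unreachable(endpoints, timeout=2.0):
    """Return the (host, port) pairs that could not be reached."""
    return [(h, p) for h, p in endpoints if not can_reach(h, p, timeout)]


# Example usage (hostnames are hypothetical):
#   endpoints = [("zk1.example.com", 2181), ("master1.example.com", 5050)]
#   print(unreachable(endpoints))
```

Run it from a slave node against the ZK and master endpoints, and from a master or scheduler node against the ZK endpoints.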

For a smaller cluster (<10 nodes) it could make sense to run a slave process on each master, especially since the standby masters won't be doing much. Even an active master for a small cluster uses only a small amount of CPU, memory, and network resources. Just make sure you adjust the --resources flag on that slave to account for the master's resource usage.
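To make the --resources adjustment concrete, here is a small sketch that subtracts a reserve for the master from the machine's totals and formats the result as a `cpus:...;mem:...` string. The function name and the default reserve (1 CPU, 1024 MB) are assumptions for illustration only. Measure what your master actually uses before picking numbers.

```python
def slave_resources(total_cpus, total_mem_mb,
                    master_cpus=1.0, master_mem_mb=1024):
    """Build a --resources value for a slave that shares a node with a master.

    The reserve defaults are illustrative guesses, not recommendations.
    """
    cpus = total_cpus - master_cpus
    mem = total_mem_mb - master_mem_mb
    if cpus <= 0 or mem <= 0:
        raise ValueError("nothing left to offer after reserving for the master")
    # Memory is expressed in MB; ':g' drops a trailing '.0' from the CPU count.
    return f"--resources=cpus:{cpus:g};mem:{int(mem)}"


if __name__ == "__main__":
    # An 8-core, 16 GB machine with the default reserve:
    print(slave_resources(8, 16384))  # --resources=cpus:7;mem:15360
```

If you pass the value on a shell command line, quote it, since the semicolon would otherwise end the command.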

Once your cluster grows larger (especially >100 nodes), the network traffic to/from the master, as well as its CPU and memory utilization, becomes significant enough that you wouldn't want to run a Mesos slave on the same node as the master. It should be fine to co-locate ZK with your master even at large scale.

You didn't specifically ask, but I'll also discuss where to run your framework schedulers (e.g. Spark, Marathon, or Chronos). These could be co-located with any of the other components, but they only really need to be able to reach the master and ZK nodes, since all communication to slaves goes through the master. Some customers run the schedulers on master nodes, some run them on edge nodes (so users don't have access to the slaves), and others use meta-frameworks like Marathon to run other schedulers on slaves as Mesos tasks.