I have an api server running Node.js that was using it's cluster module and testing looked to be pretty good. Now our IT department wants to move to using Docker containers which I am happy about but I've never actually used it other than just playing around. But I had a thought, the Node.js app runs within a single Docker process so the cluster module wouldn't really be the best as the single Docker process can be a slow point of the setup until the request is split up within that process by the cluster module.
So really a cluster of Docker containers running being able to start and stop them on the fly is more important than using Node.js' cluster module correct?
If I have a cluster of containers, would using Node.js' cluster module get me anything? The api endpoints take less than .5sec to return (usually quite a bit less).
I'm using MySQL (believe it's a single server, nothing more currently) so there shouldn't be any reason to use a data integrity solution then.
What I've seen as the best solution when using Docker is to keep as fewer processes per container as possible since containers are lightweight; you don't want processes trying to use more than one CPU. So, running a cluster in the container won't add any value and might worse latency.
Here https://medium.com/@CodeAndBiscuits/understanding-nodejs-clustering-in-docker-land-64ce2306afef#.9x6j3b8vw Chad Robinson explains the idea in general terms.
Kubernetes, Rancher, Mesos and other container management layers handle the load-balancing. They provide "scheduling" (moving those Docker container slices around different CPUs and machines to get a good usage across the cluster) and "networking" (load balancing inbound requests to those containers) layers internally.
Update
I think it's worth adding the link Why it is recommended to run only one process in a container? where people share their ideas and experiences, but chiefly from Jon there are some interesting points:
Provided that you give a single responsibility (single process, function or concern) to a container: Good idea Docker names this 'concern' ;)
You'll have to measure to be sure, but my hunch would be running with node's cluster module would be worthwhile. It would get you more CPU utilization with the least amount of extra overhead. No extra containers to manage (start, stop, monitor). Plus the cluster workers have an efficient communication mechanism. The most reasonable evolution (don't skip steps) would seem to me: