I'm attempting to implement a Node module which uses cluster
. The problem is that the entire parent scope is forked alongside the intended cluster code. I discovered it while writing tests in Mocha for the module: the test suite will run many times, instead of once.
See below, myModule.js creates N workers, one for each CPU. These workers are http servers, or could be anything else.
Each time the test.js runs, the script runs N + 1 times. In the example below, console.log runs 5 times on my quad core.
Can someone explain if this is an implementation issue or cluster config issue? Is there any way to limit the scope of fork() without having to import a module ( as in this solution https://github.com/mochajs/mocha/issues/826 )?
/// myModule.js ////////////////////////////////////
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;
var startCluster = function(){
if (cluster.isMaster) {
// CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU
//master does not listen to UDP messages.
for (var i = 0; i < numCPUs; i++) {
var worker = cluster.fork();
}
} else {
// Worker processes have an http server.
http.Server(function (req, res){
res.writeHead(200);
res.end('hello world\n');
}).listen(8000);
}
return
}
module.exports = startCluster;
/////////////////////////////////////////////////
//// test.js ////////////////////////////////////
var startCluster = require('./myModule.js')
startCluster()
console.log('hello');
////////////////////////////////////////////////////////
So I'll venture an answer. Looking closer at the node docs there is a cluster.setupMaster
which can override defaults. The default on a cluster.fork()
is to execute the current script, with "file path to worker file. (Default=process.argv[1])"
https://nodejs.org/docs/latest/api/cluster.html#cluster_cluster_settings
So if another module is importing a script with a cluster.fork() call, it will still use the path of process.argv[1], which may not be the path you expect, and have unintended consequences.
So we shouldn't initialize the cluster master and worker in the same file as the official docs suggest. It would be prudent to separate the worker into a new file and override the default settings. (Also for safety you can add the directory path with __dirname ).
cluster.setupMaster({ exec: __dirname + '/worker.js',});
So here would be the corrected implementation:
/// myModule.js ////////////////////////////////////
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;
var startCluster = function(){
cluster.setupMaster({
exec: __dirname + '/worker.js'
});
if (cluster.isMaster) {
// CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU
for (var i = 0; i < numCPUs; i++) {
var worker = cluster.fork();
}
}
return
}
module.exports = startCluster;
/////////////////////////////////////////////////
//// worker.js ////////////////////////////////////
var http = require('http');
// All worker processes have an http server.
http.Server(function (req, res){
res.writeHead(200);
res.end('hello world\n');
}).listen(8000);
////////////////////////////////////////////////////////
//// test.js ////////////////////////////////////
var startCluster = require('./myModule.js')
startCluster()
console.log('hello');
////////////////////////////////////////////////////////
You should only see 'hello' once instead of 1 * Number of CPUs
You need to have the "isMaster" stuff at the top of your code, not inside the function. The worker will run from the top of the module ( it's not like a C++ fork, where the worker starts at the fork() point ).
I assume that you want the startCluster = require('./cluster-bug.js') to eval only once? Well, thats because your whole script runs clustered. What you do specify inside startCluster is only to make it vary between master and slave clusters. Cluster spawns fork of file in which it is initialised.