I am trying to get my head around creating a non-blocking piece of heavy computation in nodejs. Take this example (stripped out of other stuff):
    http.createServer(function(req, res) {
        console.log(req.url);
        sleep(10000);
        res.end('Hello World');
    }).listen(8080, function() { console.log("ready"); });
As you can imagine, if I open 2 browser windows at the same time, the first will wait 10 seconds and the other will wait 20, as expected. So, armed with the knowledge that a callback is somehow asynchronous I removed the sleep and put this instead:
    doHeavyStuff(function() {
        res.end('Hello World');
    });
with the function simply defined:
    function doHeavyStuff(callback) {
        sleep(10000);
        callback();
    }
That of course does not work... I have also tried defining an EventEmitter and registering a listener on it, but the emitter's main function still has the sleep inside before emitting 'done', so again everything blocks.
I am wondering here how other people write non-blocking code... for example, the mongojs module or child_process.exec are non-blocking, which means that somewhere down in the code they either fork a process or run the work on another thread and listen to its events. How can I replicate this in a method that, for example, has a long process going?
Am I completely misunderstanding the nodejs paradigm? :/
Thanks!
Update: solution (sort of)
Thanks to Linus for the answer. Indeed, the only way is to spawn a child process, for example another node script:
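    var http = require('http');
    var exec = require('child_process').exec;

    http.createServer(function(req, res) {
        console.log(req.url);
        // Run the heavy computation in a separate node process
        var child = exec('node calculate.js', function (err, strout, strerr) {
            console.log("done");
            res.end(strout);
        });
    }).listen(8080, function() { console.log("ready"); });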
The calculate.js can take its time to do what it needs and return. In this way, multiple requests will be run in parallel so to speak.
This is a classic misunderstanding of how the event loop works.
This isn't something that is unique to node - if you have a long running computation in a browser, it will also block. The way to do this is to break the computation up into small chunks that yield execution to the event loop, allowing the JS environment to interleave with other competing calls, but there is only ever one thing happening at one time.
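As a rough illustration of the chunking idea, here is a minimal sketch that assumes the heavy work is a simple summation that can be cut into slices. The names sumInChunks and chunkSize are made up for this example. Each slice runs synchronously, then setImmediate hands control back to the event loop before the next slice starts, so other requests can be served in between.

```javascript
// Sum the integers 0..n-1 in slices of `chunkSize` iterations.
// sumInChunks is a hypothetical helper, not part of any library.
function sumInChunks(n, chunkSize, callback) {
  let i = 0;
  let total = 0;

  function runChunk() {
    // Do one bounded slice of work synchronously.
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) {
      total += i;
    }

    if (i < n) {
      // Yield to the event loop, then continue with the next slice.
      setImmediate(runChunk);
    } else {
      callback(null, total);
    }
  }

  // Start asynchronously so the callback never fires before this function returns.
  setImmediate(runChunk);
}
```

Inside a request handler this could be called as sumInChunks(1e8, 1e6, function (err, total) { res.end(String(total)); }). The total time for one request is not reduced, and still only one thing runs at a time; the difference is that other callbacks get a turn between slices.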
The setImmediate demo, which you can find here, may be instructive.
You can't do that directly without using some of the I/O modules in node (such as fs or net). If you need to do a long-running computation, I suggest you do that in a child process (e.g. child_process.fork) or with a queue.
If your computation can be split into chunks, you could schedule an executor to poll for data every N seconds, then run again after M seconds. Or spawn a dedicated child for that task alone, so that the main thread doesn't block.
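The child-process suggestion could look roughly like the sketch below, which uses child_process.fork and its built-in message channel. The worker script, the runInChild helper, and the summation workload are all assumptions made for illustration. The worker source is written to a temp file only so that the example fits in a single file; in a real project it would normally be its own module, such as the calculate.js from the question.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical worker: receives a number, does the blocking loop, replies.
const workerSource = `
process.on('message', (n) => {
  let total = 0;
  for (let i = 0; i < n; i++) total += i;
  process.send(total);
});
`;

// Written to a temp file purely to keep this sketch self-contained.
const workerPath = path.join(os.tmpdir(), `heavy-worker-${process.pid}.js`);
fs.writeFileSync(workerPath, workerSource);
process.on('exit', () => {
  try { fs.unlinkSync(workerPath); } catch (e) { /* already removed */ }
});

// Run the heavy loop in a separate node process and report back once.
function runInChild(n, callback) {
  const child = fork(workerPath);
  let finished = false;

  function finish(err, result) {
    if (finished) return; // guard against error + exit both firing
    finished = true;
    child.kill();
    callback(err, result);
  }

  child.once('message', (result) => finish(null, result));
  child.once('error', (err) => finish(err));
  child.once('exit', (code) => {
    finish(new Error('worker exited with code ' + code + ' before replying'));
  });

  child.send(n);
}
```

A handler could then call runInChild(1e9, function (err, total) { res.end(err ? 'error' : String(total)); }). The blocking loop runs in the child, so the parent's event loop stays free to accept other requests. Starting a process per request is not cheap, so a small pool of long-lived children or a queue may be a better fit under load.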
We (Microsoft) just released napajs that can work with Node.js to enable multithreading JavaScript scenarios in the same process.
Your code will then look like:
You can read this post for more details.