Hopefully I can describe what I'm looking for clearly enough. I'm working with Node and Python.
I'm trying to run a number of child processes (.py scripts, launched with child_process.exec()) in parallel, but no more than a specified number at a time (say, 2). Requests arrive in batches of unknown size (say this batch has 3 requests). Once the limit is reached, I'd like to hold off spawning further processes until one of the running ones finishes.

    for (var i = 0; i < requests.length; i++) {
        // code that would ideally block execution for a moment
        while (active_pids.length == max_threads) {
            console.log("Waiting for more threads...");
            sleep(100);
            continue;
        }
        // code that needs to run if threads are available
        active_pids.push(i);
        cp.exec('python python-test.py ' + requests[i], function (err, stdout) {
            console.log("Data processed for: " + stdout);
            active_pids.shift();
            if (err != null) {
                console.log(err);
            }
        });
    }

I know that while loop doesn't work; it was just my first attempt.
I'm guessing there's a way to do this with something like:

    setTimeout(function someSpawningFunction() {
        if (active_pids.length == max_threads) {
            return;
        } else {
            // spawn process?
        }
    }, 100);

But I can't quite wrap my head around it.
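To make the idea concrete, here is a minimal sketch of what I think a non-polling version could look like: instead of checking on a timer, the next process is started from the completion callback of one that just finished. The names runLimited and spawnRequests are hypothetical helpers I made up for illustration, and the sketch assumes limit is at least 1.

```javascript
const cp = require('child_process');

// Hypothetical helper: runs `tasks` with at most `limit` in flight at once.
// Each task is a function that takes a Node-style callback(err, result).
// `done(results)` is called once, after every task has called back.
function runLimited(tasks, limit, done) {
  const results = new Array(tasks.length);
  let next = 0;     // index of the next task to start
  let active = 0;   // tasks currently running
  let finished = 0; // tasks that have called back

  if (tasks.length === 0) {
    done(results);
    return;
  }

  function startMore() {
    // Fill every free slot; this returns immediately and never blocks.
    while (active < limit && next < tasks.length) {
      const index = next++;
      active++;
      tasks[index](function (err, result) {
        active--;
        finished++;
        results[index] = { err: err, result: result };
        if (finished === tasks.length) {
          done(results);
        } else {
          startMore(); // a slot just freed up
        }
      });
    }
  }

  startMore();
}

// Assumed usage with the script from the question (defined, not called here).
function spawnRequests(requests, maxThreads, done) {
  const tasks = requests.map(function (request) {
    return function (callback) {
      cp.exec('python python-test.py ' + request, function (err, stdout) {
        callback(err, stdout);
      });
    };
  });
  runLimited(tasks, maxThreads, done);
}
```

Calling spawnRequests(requests, 2, function (results) { ... }) would then start two scripts right away and launch each remaining one as soon as a running one exits. Nothing waits in a loop, so the exec callbacks are free to fire.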
Or maybe

    waitpid(-1)

inserted in the for loop above, inside an if statement, in place of the while loop? However, I can't get the waitpid() module to install at the moment.
And yes, I understand that blocking execution is considered very bad in JS, but in my case, I need it to happen. I'd rather avoid external cluster manager-type libraries if possible.
Thanks for any help.
EDIT/Partial Solution
An ugly hack would be to use execSync(), as suggested in the answer to this SO question. But that would block the loop until the LAST child finished. That's my plan so far, but it's not ideal.
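A non-blocking alternative that still reads like a sequential loop might use promises with async/await. The sketch below is only my guess at how that could look, assuming a Node version with util.promisify and async functions. mapLimited and processRequests are hypothetical names, and I switched to execFile here so the request is passed as an argument rather than through a shell string.

```javascript
const util = require('util');
const execFile = util.promisify(require('child_process').execFile);

// Hypothetical helper: applies `worker` (a function returning a promise)
// to every item, with at most `limit` calls pending at any moment.
// Resolves to the results in input order; rejects if any worker rejects.
async function mapLimited(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each lane takes the next unclaimed item as soon as its previous one settles.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const lanes = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    lanes.push(lane());
  }
  await Promise.all(lanes);
  return results;
}

// Assumed usage with the script from the question (defined, not called here).
function processRequests(requests, maxThreads) {
  return mapLimited(requests, maxThreads, async function (request) {
    const { stdout } = await execFile('python', ['python-test.py', request]);
    return stdout;
  });
}
```

Each await only pauses its own lane, not the whole process, so with a limit of 2 there are two scripts running while the rest of the batch waits its turn.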