After playing around for some time with R's parallel package on my Debian-based machine, I still can't find a way to remove all zombie child processes after a computation.
I'm looking for a general, OS-independent solution.
Below is a simple script illustrating the problem for two cores:
library(parallel)
testfun <- function(){TRUE}
cltype <- ifelse(.Platform$OS.type != "windows", "FORK", "PSOCK")
cl <- makeCluster(2, type = cltype)
p <- clusterCall(cl, testfun)
stopCluster(cl)
Unfortunately, this script leaves two zombie processes in the process table, which are only removed when R is shut down.
This only seems to be an issue with "FORK" clusters. If you make a "PSOCK" cluster instead, the processes will die when you call stopCluster(cl). Is there anything preventing you from using a "PSOCK" cluster on your Debian-based machine?
The answer to your problem is probably in the help file for makeCluster(). At the bottom of that file it says: "It is good practice to shut down the workers by calling stopCluster: however the workers will terminate themselves once the socket on which they are listening for commands becomes unavailable, which it should if the master R session is completed (or its process dies)."
The solution (it works for me) is to define a port for your cluster when you create it.
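A minimal sketch of what that could look like, assuming a "PSOCK" cluster and that the port is free on your machine. The value 11234 is only an example and is not taken from the original post; the port argument is passed through makeCluster() to makePSOCKcluster().

```r
library(parallel)

# Assumption: 11234 is an arbitrary example port; pick any free port.
cl <- makeCluster(2, type = "PSOCK", port = 11234)

# Run a trivial function on every worker.
res <- clusterCall(cl, function() TRUE)

# Shut the workers down explicitly, as the help file recommends.
stopCluster(cl)
```

Whether fixing the port also gets rid of the zombies of a "FORK" cluster is something you would have to check on your own system.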
Another (maybe less useful) solution is to set a timeout for your sockets. The timeout value is given in seconds.
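As a rough sketch, the timeout can be passed the same way when the cluster is created. The value of 60 seconds is purely illustrative; a value that is too short may interrupt long-running jobs, so treat it as an assumption to tune.

```r
library(parallel)

# Assumption: 60 seconds is an illustrative value, not a recommendation.
cl <- makeCluster(2, type = "PSOCK", timeout = 60)

# Ask each worker for its process ID to confirm both are alive.
res <- clusterCall(cl, Sys.getpid)

stopCluster(cl)
```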
In any case, the aim should be to make the socket connection unavailable. Either closing the ports or closing the main R process would do this.
Edit: What I meant was to close the ports on which the processes are listening. This should be OS-independent. You can try showConnections(all = TRUE), which lists all connections, and then closeAllConnections().
Sorry if this doesn't work either.
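Put together, the idea from the edit might be tried like this. It is only a sketch using a "PSOCK" cluster: closeAllConnections() closes every user connection in the session (not just the cluster's sockets), and the cluster object is unusable afterwards, so this is a blunt tool.

```r
library(parallel)

cl <- makeCluster(2, type = "PSOCK")

# List every connection; the workers should show up with class "sockconn".
conns <- showConnections(all = TRUE)
print(conns[, c("description", "class", "isopen"), drop = FALSE])
n_sockets <- sum(conns[, "class"] == "sockconn")

# Close all user connections, including the worker sockets.
# The workers should then exit because their socket is no longer available.
closeAllConnections()
n_open_after <- nrow(showConnections())
```

I have not verified that this removes the zombies left behind by a "FORK" cluster.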