doParallel: combining a progress bar with a parallel process

Posted 2019-07-30 04:39

Question:

I'm having trouble combining the process that I want to run in parallel with the creation of a progress bar.

My code for the process is:

pred_pnn <- function(x, nn){
  xlst <- split(x, 1:nrow(x))
  pred <- foreach(i = xlst,.packages = c('tcltk', 'foreach'), .combine = rbind) 
  %dopar% 
{ mypb <- tkProgressBar(title = "R progress bar", label = "",
                        min = 0, max = max(jSeq), initial = 0, width = 300)
  foreach(j = jSeq) %do% {Sys.sleep(.1)
  setTkProgressBar(mypb, j, title = "pb", label = NULL)
  }  
  library(pnn)
 data.frame(prob = guess(nn, as.matrix(i))$probabilities[1], row.names = NULL)
}
}

I combined my code with the code from here, but it doesn't run. I get a syntax error, but I can't find it.

I tried this other code:

pred_pnn <- function(x, nn){
  xlst <- split(x, 1:nrow(x))
  pred <- foreach(i = xlst, .combine = rbind) %dopar% 
{library(pnn)
 cat(i, '\n')
 data.frame(prob = guess(nn, as.matrix(i))$probabilities[1], row.names = NULL)
}
}

But I get an error too.

Answer 1:

The approach that you're trying to use might work under certain circumstances, but it isn't a good general solution. What I would want to do is to create a progress bar in the master process (outside of the foreach loop) and then have foreach update that progress bar as tasks are returned. Unfortunately, none of the backends support that. It's possible to do that using combine function tricks, but only if you're using a backend that supports calling the combine function on-the-fly, which doParallel, doSNOW and doMC do not. Those backends don't call combine on the fly because they are implemented using functions such as clusterApplyLB and mclapply which don't support a hook to allow user supplied code to be executed when tasks are returned.

Because I've seen interest in progress bar support in foreach, I modified the doSNOW package to add support for a doSNOW-specific "progress" option, and I checked the code into the R-Forge website. It makes use of some lower level functions in the snow package which unfortunately are not exported by the parallel package.

If you want to try out this new feature, you will need to install doSNOW from R-Forge. I did this on my MacBook Pro using the command:

install.packages("doSNOW", repos="http://R-Forge.R-project.org", type="source")

Here is a simple example script that demonstrates the experimental "progress" option:

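library(doSNOW)
library(tcltk)

# Start three socket workers and register them as the foreach backend
cl <- makeSOCKcluster(3)
registerDoSNOW(cl)

# The progress bar lives in the master process
pb <- tkProgressBar(max=100)

# doSNOW calls this with the number of tasks completed so far
progress <- function(n) setTkProgressBar(pb, n)
opts <- list(progress=progress)

r <- foreach(i=1:100, .options.snow=opts) %dopar% {
  Sys.sleep(1)
  sqrt(i)
}

# Clean up the progress bar and the workers
close(pb)
stopCluster(cl)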

Update

The progress option is now available in the latest version of doSNOW on CRAN.
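
Applied to the row-by-row loop from the question, the progress option could be used roughly as in the sketch below. This is an illustration rather than a tested solution for the original data: rows_with_progress is a hypothetical helper name, a text progress bar from utils stands in for the Tk one so the sketch runs without a display, and the toy data frame and row function replace the pnn model.

```r
library(doSNOW)  # also attaches foreach and snow

# Hypothetical helper (not part of doSNOW): apply `fun` to each row of `x`
# in parallel and update a progress bar in the master process as tasks return.
rows_with_progress <- function(x, fun) {
  xlst <- split(x, seq_len(nrow(x)))          # one single-row data frame per task
  pb <- txtProgressBar(max = length(xlst), style = 3)
  on.exit(close(pb))                          # close the bar even if the loop fails
  opts <- list(progress = function(n) setTxtProgressBar(pb, n))
  foreach(i = xlst, .combine = rbind, .options.snow = opts) %dopar% {
    fun(i)
  }
}

cl <- makeSOCKcluster(2)
registerDoSNOW(cl)

# Toy stand-in for the prediction step: one result row per input row
x <- data.frame(a = 1:10, b = 11:20)
res <- rows_with_progress(x, function(row) data.frame(total = row$a + row$b))

stopCluster(cl)
```

For the original problem, the function passed as fun would presumably wrap the call from the question, for example function(row) data.frame(prob = pnn::guess(nn, as.matrix(row))$probabilities[1], row.names = NULL), with the pnn package installed on the workers. Note that %dopar% sits on the same line as the foreach() call: if a line ends after foreach(...) and the next one starts with %dopar%, R treats the first line as a complete expression and reports a syntax error at the operator.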