Passing an entire package to a snow cluster

2019-04-07 06:54发布

I'm trying to parallelize (using snow::parLapply) some code that depends on a package (ie, a package other than snow). Objects referenced in the function called by parLapply must be explicitly passed to the cluster using clusterExport. Is there any way to pass an entire package to the cluster rather than having to explicitly name every function (including a package's internal functions called by user functions!) in clusterExport?

1条回答
干净又极端
2楼-- · 2019-04-07 07:21

Install the package on all nodes, and have your code call library(thePackageYouUse) on all nodes via one the available commands, egg something like

 clusterApply(cl, library(thePackageYouUse))

I think the parallel package which comes with recent R releases has examples -- see for example here from help(clusterApply) where the boot package is loaded everywhere:

 ## A bootstrapping example, which can be done in many ways:
 clusterEvalQ(cl, {
   ## set up each worker.  Could also use clusterExport()
   library(boot)
   cd4.rg <- function(data, mle) MASS::mvrnorm(nrow(data), mle$m, mle$v)
   cd4.mle <- list(m = colMeans(cd4), v = var(cd4))
   NULL
 })
查看更多
登录 后发表回答