How can I export the global environment for the beginning of each parallel simulation in foreach? The following code is part of a function that is called to run the simulations.
num.cores <- detectCores()-1
cluztrr <- makeCluster(num.cores)
registerDoParallel(cl = cluztrr)
sim.result.list <- foreach(r = 1:simulations,
.combine = list,
.multicombine = TRUE,
) %dopar% {
#...tons of calculations using many variables...
list(vals1,
vals2,
vals3)
}
stopCluster(cluztrr)
Is it necessary to use .export with a character vector of every variable and function that I use? Would that be slow in execution time?
If the foreach loop is in the global environment, variables should be exported automatically. If not, you can use .export = ls(globalenv())
(or .GlobalEnv
).
For functions from other packages, you just need to use the syntax package::function
.
The "If [...] in the global environment, ..." part of F. Privé reply is very important here. The foreach framework will only identify global variables in that case. It will not do so if the foreach()
call is done within a function.
However, if you use the doFuture backend (disclaimer: I'm the author);
library("doFuture")
registerDoFuture()
plan(cluster, workers = cl)
global variables that are needed will be automatically identified and exported (which is then done by the future framework and not the foreach framework). Now, if you rely on this, and don't explicitly specify .export
, then your code will only work with doFuture
and none of the other backends. That's a decision you need to make as a developer.
Also, automatic exporting of globals is neat, but be careful that you know how much is exported; exporting too many too large objects can be quite costly and introduce lots of overhead in your parallel code.