Reading the doc and SO questions, it seems that foreach requires me to specify:
  .packages: character vector of packages that the tasks depend on.
  .export: character vector of variables to export. This can be useful when accessing a variable that isn't defined in the current environment.
However, the following code works even though each of my tasks depends on library(tree) and the variable formulas. Why don't I have to specify .packages="tree", .export="formulas"?
library(tree)
library(doMC)  # provides registerDoMC; also attaches foreach
data(iris)
registerDoMC(2)  # two forked workers
formulas <- c(as.formula("Species ~ Sepal.Length + Sepal.Width"),
              as.formula("Species ~ Petal.Length + Petal.Width"))
Res <- foreach(i=(1:2)) %dopar% {
    formula <- formulas[[i]]
    grown_tree <- tree(formula, data=iris)
}
The doMC backend uses the mclapply function, and mclapply forks its workers, so the workers inherit their environment from the current process. Therefore you don't have to use the .packages option to load packages that are already loaded or use the .export option to export variables that are defined in the current environment. Backends such as doSNOW, doMPI and doRedis don't use fork, so foreach loops that work with doMC may not work on these backends.

I think it's a good practice to use those options with doMC because it makes the code more portable, but as you've discovered, it isn't always necessary.
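
To make the portability point concrete, here is a rough sketch of the same loop on a backend that does not fork. It assumes doParallel with a socket cluster, which is my choice for illustration and not one of the backends named above, and it wraps the loop in a hypothetical helper, fit_trees, that is not part of the question. As far as I know, foreach only auto-exports variables it finds in the environment the loop is called from, so the global formulas has to be listed in .export once the loop lives inside a function, and the workers need .packages to load tree because they start as fresh R sessions.

```r
library(doParallel)  # also attaches foreach and parallel

# Socket (PSOCK) workers are separate R sessions; they inherit nothing
# from this process, unlike the forked workers that doMC uses.
cl <- makeCluster(2)
registerDoParallel(cl)

formulas <- c(as.formula("Species ~ Sepal.Length + Sepal.Width"),
              as.formula("Species ~ Petal.Length + Petal.Width"))

# Hypothetical helper for illustration. The loop runs inside a function,
# so `formulas` (a global) is not in the calling environment and is
# exported explicitly; `tree` is loaded on each worker via .packages.
fit_trees <- function() {
  foreach(i = 1:2, .packages = "tree", .export = "formulas") %dopar% {
    formula <- formulas[[i]]
    tree(formula, data = iris)  # iris comes from the datasets package on the worker
  }
}

res <- fit_trees()
stopCluster(cl)
```

The same foreach call should also run unchanged under doMC, where the two options are simply redundant.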