is there another way of loading extra packages in

Published 2019-02-27 11:56

Question:

One way of parallelization in R is through the snowfall package. To send custom functions to workers you can use sfExport() (see Joris' post here).

I have a custom function that depends on functions from non-base packages that are not loaded automagically. Thus, when I run my function in parallel, R fails because certain functions are not available on the workers (think of the packages spatstat, splancs, sp...). So far I've solved this by calling library() inside my custom function. This loads the packages on the first run and presumably does nothing on subsequent iterations. Still, I was wondering if there's another way of telling each worker to load the package once, on the first iteration, and be done with it (or am I missing something and each iteration starts as a tabula rasa?).

Answer 1:

There's a specific command for that in snowfall, sfLibrary(). See also ?"snowfall-tools". Calling library manually on every node is strongly discouraged. sfLibrary is basically a wrapper around the solution Dirk gave based on the snow package.
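
As a minimal sketch of how that might look, assuming snowfall is installed and using the boot package only as a stand-in for spatstat, sp and friends (the function name boot_mean and the choice of 2 CPUs are made up for illustration):

```r
library(snowfall)

# Start a local cluster; 2 CPUs is an arbitrary choice for this sketch.
sfInit(parallel = TRUE, cpus = 2)

# Attach the package once on every worker, up front.
sfLibrary(boot)

# Hypothetical custom function that relies on boot::boot() being available.
boot_mean <- function(seed) {
  set.seed(seed)
  x <- rnorm(50)
  b <- boot(x, statistic = function(d, i) mean(d[i]), R = 100)
  mean(b$t)
}

# Send the custom function to the workers, then run it in parallel.
sfExport("boot_mean")
res <- sfLapply(1:4, boot_mean)

sfStop()
```

The workers stay alive between sfInit() and sfStop(), so the package is attached once per worker rather than once per iteration.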



Answer 2:

I don't understand the question.

Packages are loaded via library(), and most of the parallel execution functions support that. For example, the snow package uses

clusterEvalQ(cl, library(boot))

to evaluate the given expression (here a call to library()) on each node. It returns a list with one result per node, which for library() you can simply ignore. Most of the parallel execution frameworks have something like that.
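
For context, a complete round trip with snow could look roughly like the sketch below. It assumes snow is installed and that a local socket cluster with 2 workers is acceptable; the worker count and the toy bootstrap task are illustrative and not part of the original question.

```r
library(snow)

# Local socket cluster; the number of workers is arbitrary here.
cl <- makeCluster(2, type = "SOCK")

# Attach boot on every worker; we only care about the side effect.
invisible(clusterEvalQ(cl, library(boot)))

# Each task can now call boot() without loading the package itself.
res <- clusterApply(cl, 1:4, function(k) {
  set.seed(k)
  b <- boot(rnorm(50), statistic = function(d, i) mean(d[i]), R = 100)
  sd(b$t)
})

stopCluster(cl)
```

The same makeCluster(), clusterEvalQ(), clusterApply() and stopCluster() calls are also available in the parallel package that ships with R.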

Why again would you need something different, and what exactly does not work here?