Running multiple R scripts using the system() command

Posted 2019-04-14 22:28

Question:

I'm running RStudio on Windows 7. I have written a master script that generates 57 new R scripts, each containing commands to run a function based on two parameters:

vector1 <- c(1:19)
vector2 <- c(1:3)

First, the master script uses two for-loops (one using the index 'abc' for vector1, one using the index 'def' for vector2) to generate each of the 57 scripts in my working directory, following this filename convention:

run_inference_<<vector1[abc]>>_<<vector2[def]>>.R

That part runs successfully - each of the 57 scripts is generated with the correct commands inside. My working directory now contains files run_inference_1_1.R, run_inference_1_2.R, etc.
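For context, the generation step looks roughly like this (a sketch reconstructed from the description above, using vector1 and vector2 as defined earlier; my_inference_function and the body written into each file are placeholders, not the actual code):

for (abc in seq_along(vector1)) {
  for (def in seq_along(vector2)) {
    # file name follows run_inference_<vector1[abc]>_<vector2[def]>.R
    filename <- paste0("run_inference_", vector1[abc], "_", vector2[def], ".R")
    # write this parameter pair's command into the new script;
    # my_inference_function stands in for the real call
    writeLines(
      paste0("my_inference_function(", vector1[abc], ", ", vector2[def], ")"),
      con = filename
    )
  }
}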

The final thing I want to do is run all 57 scripts from the master script, and to run them simultaneously. I've tried the following inside the for-loop:

system(paste0("Rscript run_inference_", abc, "_", def, ".R"), wait = FALSE)

This does not work. However, if I open one of the 57 generated scripts and run it manually, I get the desired result from that script. This tells me the issue is in the system() command I've written.

None of the 57 scripts will be computationally intensive (yet), and the test I want to do now should take about 2 minutes on my PC. How can I edit my system() command to execute all 57 scripts simultaneously, please?

Answer 1:

You don't do this by calling system() once with one big script, unless the program you're running knows how to parallelise the work itself. You do it by calling system() multiple times from different R processes, for example using the parallel package:

# build the 57 commands, one per combination of the two parameters
combos <- expand.grid(abc = vector1, def = vector2)
scripts <- paste0("Rscript run_inference_", combos$abc, "_", combos$def, ".R")

# make lots of R processes, assuming the script to be called won't eat CPU
cl <- parallel::makeCluster(30)

parallel::parLapply(cl, scripts, function(script) system(script))
parallel::stopCluster(cl)
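Each worker blocks until its Rscript call finishes, so at most 30 of the 57 scripts run at any one time, and system() (with the default wait = TRUE) returns the exit status of the command it launched. If you want to see which scripts, if any, failed, a minimal variant of the call above, run before stopping the cluster, is:

exit_codes <- unlist(parallel::parLapply(cl, scripts, system))
scripts[exit_codes != 0]  # commands that returned a non-zero exit status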