I've used the combination of R, doMPI, and foreach on a cluster for several years now, and execution time has usually scaled roughly linearly with the number of simulation iterations. Recently I've been using the nested foreach loop below, and as I increase the number of simulations (NumSim) the run time grows far faster than linearly, and I have no idea why. Any thoughts on how to diagnose this, or where to start looking?
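For context, the surrounding script uses the usual doMPI registration pattern from the package documentation; the sketch below is illustrative rather than my exact launch script (the job is started under mpirun and the worker count comes from the scheduler):

library(doMPI)
library(foreach)

cl <- startMPIcluster()   # workers are spawned/attached under mpirun
registerDoMPI(cl)         # point %dopar% at the MPI workers

## ... nested foreach loop shown further down ...

closeCluster(cl)          # shut the workers down cleanly
mpi.quit()                # from Rmpi; finalizes MPI and exits R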
As a test, with 10 cores and everything else held the same:
NumSim = 10: 678 seconds
NumSim = 20: 1856 seconds
NumSim = 30: 3560 seconds
NumSim = 50: 7956 seconds
Based on previous work, I would have expected NumSim = 50 to take roughly 678 * 5 ~ 3390 seconds, not more than twice that.
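To make the superlinear growth concrete, here is the ratio of observed time to a linear extrapolation from the NumSim = 10 run (plain R, using only the numbers above):

expected <- 678 * c(10, 20, 30, 50) / 10   # linear extrapolation from NumSim = 10
observed <- c(678, 1856, 3560, 7956)
round(observed / expected, 2)              # 1.00 1.37 1.75 2.35 -- the ratio keeps growing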
## Outer loop: one iteration per simulation, combined with acomb.
## Inner loop: one task per person in a single population, rows stacked with rbind.
results <- foreach(j = 1:NumSim, .combine = acomb) %:%
  foreach(i = 1:PopSize, .combine = rbind, .packages = c("zoo")) %dopar% {
    annual <- AnnualProbInf(WatCons, CrpPerLit, 1, 1, naf)
    daily  <- AnnualProbInf(WatCons, CrpPerLit, 365, 365, khf)
    immune <- AnnualProbInfImm(WatCons, CrpPerLit, 730, 730, khf, DayNonSus)
    cbind(annual, daily, immune)  # value of the last expression is what foreach collects
  }
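One check I can think of is whether individual tasks themselves get slower as j increases, or whether the extra time is spent elsewhere (communication, scheduling, or the combine step). A diagnostic variant of the loop that returns only each task's elapsed time would look roughly like this (a sketch; only the timing wrapper is new, and it deliberately discards the real results):

## Sketch: collect per-task wall-clock time instead of the model output
timed <- foreach(j = 1:NumSim, .combine = rbind) %:%
  foreach(i = 1:PopSize, .combine = rbind, .packages = c("zoo")) %dopar% {
    t0 <- proc.time()[["elapsed"]]
    annual <- AnnualProbInf(WatCons, CrpPerLit, 1, 1, naf)
    daily  <- AnnualProbInf(WatCons, CrpPerLit, 365, 365, khf)
    immune <- AnnualProbInfImm(WatCons, CrpPerLit, 730, 730, khf, DayNonSus)
    c(j = j, i = i, secs = proc.time()[["elapsed"]] - t0)
  }
## Mean task time per simulation index:
tapply(timed[, "secs"], timed[, "j"], mean)

If the mean per-task time stays flat across j, the slowdown presumably comes from somewhere outside the computation itself; if it grows with j, the tasks themselves are getting slower.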