When I am using mclapply, from time to time (really randomly) it gives incorrect results. The problem is quite thoroughly described in other posts across the Internet, e.g. (http://r.789695.n4.nabble.com/Bug-in-mclapply-td4652743.html). However, no solution is provided. Does anyone know how to fix this problem? Thank you!
相关问题
- R - Quantstart: Testing Strategy on Multiple Equit
- Using predict with svyglm
- Reshape matrix by rows
- Extract P-Values from Dunnett Test into a Table by
- split data frame into two by column value [duplica
相关文章
- How to convert summary output to a data frame?
- How to plot smoother curves in R
- Paste all possible diagonals of an n*n matrix or d
- ess-rdired: I get this error “no ESS process is as
- How to use doMC under Windows or alternative paral
- dyLimit for limited time in Dygraphs
- Saving state of Shiny app to be restored later
- How to insert pictures into each individual bar in
I'm adding this answer so others hitting this question won't have to wade through the long thread of comments (I am the bounty granter but not the OP).
mclapply
initially populates the list it creates with NULLS. As the worker processes return values, these values overwrite the NULLS. If a process dies without ever returning a value,mclapply
will return a NULL.When memory becomes low, the Linux out of memory killer (oom killer)
https://lwn.net/Articles/317814/
will start silently killing processes. It does not print anything to the console to let you know what it's doing, although the oom killer activities show up in the system log. In this situation the output of
mclapply
will appear to have been randomly contaminated with NULLS.The problem reported by Winston Chang that you cite appears to have been fixed in R 2.15.3. There was a bug in
mccollect
that occurred when assigning the worker results to the result list:This fails if
unserialize(r)
returns a NULL, since assigning a NULL to a list in this way deletes the corresponding element of the list. This was changed in R 2.15.3 to:which is a safe way to assign an unknown value to a list.
So if you're using R <= 2.15.2, the solution is to upgrade to R >= 2.15.3. If you have a problem using R >= 2.15.3, then presumably it's a different problem then the one reported by Winston Chang.
I also read over the issues discussed in the R-help thread started by Elizabeth Purdom. Without a specific test case, my guess is that the problem is not due to a bug in mclapply because I can reproduce the same symptoms with the following function:
If a worker started by mclapply dies while executing a task for any reason (receiving a signal, seg faulting, exiting), mclapply will return a NULL for all of the tasks that were assigned to that worker:
In this case, NULL's were returned for tasks 1 and 3 due to prescheduling, even though only task 3 actually failed.
If a worker dies when using a function such as parLapply or clusterApply, an error is reported:
I've seen many such reports, and I think they tend to happen in large programs that use lots of packages that are hard to turn into reproducible test cases.
Of course, in this example, you'll also get an error when using lapply, although the error won't be hidden as it is with mclapply. If the problem doesn't seem to happen when using lapply, it may be because the problem rarely occurs, so it only happens in very large runs that are executed in parallel using mclapply. But it is also possible that the error occurs, not because the tasks are executed in parallel, but because they are executed by forked processes. For example, various graphics operations will fail when executed in a forked process.