knitr: starting a fresh R session to clear RAM

Posted 2020-03-29 13:39

Question:

I sometimes work with lots of objects and it would be nice to have a fresh start because of memory issues between chunks. Consider the following example:

Warning: I have 8GB of RAM. If you don't have much, this might eat it all up.

<<chunk1>>=
a <- 1:200000000
@
<<chunk2>>=
b <- 1:200000000
@
<<chunk3>>=
c <- 1:200000000
@

The solution in this case is:

<<chunk1>>=
a <- 1:200000000
@
<<chunk2>>=
rm(a)
gc()
b <- 1:200000000
@
<<chunk3>>=
rm(b)
gc()
c <- 1:200000000
@

However, in my example (which I can't post because it relies on a large dataset), even after I remove all of the objects and run gc(), R frees only some of the memory, not all of it. The reason is found in ?gc:

However, it can be useful to call ‘gc’ after a large object has
been removed, as this may prompt R to return memory to the
operating system.

Note the important word may. R has a lot of situations where it specifies may like this and so it is not a bug.
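
To make the distinction concrete, here is a minimal sketch (not part of the original post) of what rm() followed by gc() reports. The object size is an arbitrary assumption. Column 2 of the matrix returned by gc() is the "(Mb)" figure for memory that R itself considers in use; that figure can drop even when the operating system still shows the R process holding on to the memory.

```r
# Allocate a vector of doubles (size chosen arbitrarily for illustration).
x <- numeric(1e7)
used_with_x <- gc()["Vcells", 2]    # Mb of vector heap R reports as in use

# Remove the object and collect garbage again.
rm(x)
used_after_rm <- gc()["Vcells", 2]

# Difference in R's own accounting; this says nothing about whether the
# memory was actually handed back to the operating system.
used_with_x - used_after_rm
```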

Is there a chunk option according to which I can have knitr start a new R session?

Answer 1:

My recommendation would be to create an individual .Rnw file for each of the major tasks, knit them to .tex files, and then use \include or \input in a parent.Rnw file to build the full project. Control the building of the project via a makefile.
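
As a rough sketch of what each build step could do, the following R driver knits every child document in its own Rscript process, so each task starts with an empty session. The file names task1.Rnw and task2.Rnw and the helper knit_in_fresh_session are hypothetical, and the sketch assumes knitr is installed and that Rscript lives in R.home("bin"). A makefile rule would typically run the same Rscript command once per file.

```r
# Hypothetical child documents; replace with the real file names.
children <- c("task1.Rnw", "task2.Rnw")

# Path to the Rscript front end that ships with the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Knit one .Rnw file in a brand-new R process and return the .tex file name.
knit_in_fresh_session <- function(rnw) {
  expr <- sprintf('knitr::knit("%s")', rnw)
  status <- system2(rscript, c("--vanilla", "-e", shQuote(expr)))
  if (status != 0) stop("knitting failed for ", rnw)
  sub("\\.Rnw$", ".tex", rnw)
}

# Only knit the files that are actually present in the working directory.
present <- children[file.exists(children)]
tex_files <- vapply(present, knit_in_fresh_session, character(1))
```

The resulting .tex files can then be pulled into the parent document with \input.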

However, to address this specific question, using a fresh R session for each chunk, you could use the R package subprocess to spawn an R session, run the needed code, extract the results, and then kill the spawned session.
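
The same spawn, run, extract, and exit pattern can also be sketched with base R only. This is an alternative to subprocess, not the approach used in the example further down. The helper run_in_fresh_session is a hypothetical name: it writes the code to a temporary script, runs it with Rscript, and passes the result back through an .rds file.

```r
rscript <- file.path(R.home("bin"), "Rscript")

# Run a string of R code in a new R process and return its value.
run_in_fresh_session <- function(code) {
  script   <- tempfile(fileext = ".R")
  out_file <- tempfile(fileext = ".rds")
  on.exit(unlink(c(script, out_file)))

  # The child evaluates the code and saves the last value to the file
  # whose path it receives as its first command-line argument.
  writeLines(c(
    "result <- local({",
    code,
    "})",
    "saveRDS(result, commandArgs(trailingOnly = TRUE)[1])"
  ), script)

  status <- system2(rscript, c("--vanilla", shQuote(script), shQuote(out_file)))
  if (status != 0) stop("child R session failed")
  readRDS(out_file)
}

# The child session exits on its own, releasing all of its memory.
s <- run_in_fresh_session("y <- rnorm(100, mean = 2); summary(y)")
print(s)
```

Only the (small) result crosses back into the knitr session, so large intermediate objects never occupy memory in the parent process.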

A simple example .Rnw file using subprocess:

\documentclass{article}
\usepackage{fullpage}
\begin{document}

<<include = FALSE>>=
knitr::opts_chunk$set(collapse = FALSE)
@

<<>>=
library(subprocess)

# define a function to identify the R binary
R_binary <- function () {
  R_exe <- ifelse (tolower(.Platform$OS.type) == "windows", "R.exe", "R")
  return(file.path(R.home("bin"), R_exe))
}
@


<<>>=
# Start a subprocess running vanilla R.
subR <- subprocess::spawn_process(R_binary(), c("--vanilla", "--quiet"))
Sys.sleep(2) # wait for the process to spawn

# write to the process
subprocess::process_write(subR, "y <- rnorm(100, mean = 2)\n")
subprocess::process_write(subR,  "summary(y)\n")

# read from the process
subprocess::process_read(subR, PIPE_STDOUT)

# kill the process before moving on.
subprocess::process_kill(subR)
@


<<>>=
print(sessionInfo(), local = FALSE)
@

\end{document}

Knitting this file generates a PDF containing the output of each chunk.



Tags: r, knitr