Updating an existing Rdata file

Posted 2019-03-09 19:43

Question:

I often need to update one or two data objects in an Rdata file that was previously created with save. If I don't take care to load the file first, I can forget to re-save some of the objects it contains. For example, I'm working on a package with some objects stored in sysdata.rda (look-up tables for internal use, which I do not want to export), and I only want to worry about updating individual objects.

I haven't managed to work out whether there is a standard way to do this, so I wrote my own function:

resave <- function (..., list = character(), file = stop("'file' must be specified")) {
  # create a staging environment to load the existing R objects
  stage <- new.env()
  load(file, envir=stage)
  # get the list of objects to be "resaved"
  names <- as.character(substitute(list(...)))[-1L]
  list <- c(list, names)
  # copy the objects to the staging environment
  lapply(list, function(obj) assign(obj, get(obj), stage))
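  # save everything in the staging environment
  # (envir = stage is needed; otherwise save() looks the names up from this function's frame)
  save(list = ls(stage, all.names = TRUE), envir = stage, file = file)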
}

It does seem like overkill though. Is there a better/easier way to do this?

As an aside, am I right in assuming that a new environment created in the scope of a function is destroyed after the function call?

Answer 1:

Here is a slightly shorter version:

resave <- function(..., list = character(), file) {
   previous  <- load(file)
   var.names <- c(list, as.character(substitute(list(...)))[-1L])
   for (var in var.names) assign(var, get(var, envir = parent.frame()))
   save(list = unique(c(previous, var.names)), file = file)
}

I took advantage of the fact that load returns the names of the loaded variables, so I could use the function's own environment instead of creating one. And when calling get, I was careful to start the lookup in the environment from which the function is called, i.e. parent.frame(), rather than in the function's environment, which by then holds the old values loaded from the file.
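To make the first point concrete, here is a minimal sketch of what load returns. The file path and the object names a and b are made up for illustration, and the file is loaded into a scratch environment so the workspace is left untouched.

```r
# Sketch: load() invisibly returns a character vector naming the restored objects.
rdata_path <- tempfile(fileext = ".RData")  # hypothetical temporary file
a <- 1
b <- "two"
save(a, b, file = rdata_path)

scratch <- new.env()                         # keeps the loaded copies out of the workspace
loaded <- load(rdata_path, envir = scratch)  # names of everything stored in the file
loaded_sorted <- sort(loaded)                # sorted so the result does not depend on storage order
print(loaded_sorted)
```

This vector is what `previous` holds inside resave, which is why the function can re-save every old object without listing the names by hand.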

Here is a simulation:

x1 <- 1
x2 <- 2
x3 <- 3
save(x1, x2, x3, file = "abc.RData")

x1 <- 10
x2 <- 20
x3 <- 30
resave(x1, x3, file = "abc.RData")

load("abc.RData")
x1
# [1] 10
x2
# [1] 2
x3
# [1] 30


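Answer 2:

I have added a refactored version of @flodel's answer to the stackoverflow package. It loads the file into an explicit environment rather than into the function's own frame, which is a bit more defensive: objects stored in the file cannot clash with the function's local variables. Note that copyEnv is not a base R function; here it copies the objects named in list from the calling environment into e.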

resave <- function(..., list = character(), file) {
  e <- new.env()
  load(file, e)
  list <- union(list, as.character(substitute((...)))[-1L])
  copyEnv(parent.frame(), e, list)
  save(list = ls(e, all.names=TRUE), envir = e, file = file)
}
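If you would rather not depend on copyEnv, the same idea can be sketched in base R alone. This is only an illustration under my own assumptions: the name resave_base and the temporary file are hypothetical, and the copy step is a plain loop rather than whatever copyEnv does internally.

```r
# Sketch of the explicit-environment approach using base R only.
# resave_base is a hypothetical name, not a function from any package.
resave_base <- function(..., list = character(), file) {
  caller <- parent.frame()               # environment holding the updated objects
  e <- new.env()
  load(file, envir = e)                  # existing contents of the file
  list <- union(list, as.character(substitute(list(...)))[-1L])
  for (nm in list) {
    # inherits = FALSE: only accept objects defined directly in the caller
    assign(nm, get(nm, envir = caller, inherits = FALSE), envir = e)
  }
  save(list = ls(e, all.names = TRUE), envir = e, file = file)
}

# Usage: update x1 only; x2 keeps the value already stored in the file.
rdata_path <- tempfile(fileext = ".RData")
x1 <- 1
x2 <- 2
save(x1, x2, file = rdata_path)
x1 <- 10
resave_base(x1, file = rdata_path)

check <- new.env()                       # inspect the file without touching the workspace
load(rdata_path, envir = check)
```

Using inherits = FALSE is a design choice of this sketch: it fails with an error if a named object is not defined directly in the calling environment, instead of silently picking up a same-named object from further up the search path.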


Tags: r rdata