How does local() differ from other approaches to c

2019-03-18 00:44发布

问题:

Yesterday I learned from Bill Venables how local() can help create static functions and variables, e.g.,

example <- local({
  hidden.x <- "You can't see me!"
  hidden.fn <- function(){
    cat("\"hidden.fn()\"")
  }
  function(){
    cat("You can see and call example()\n")
    cat("but you can't see hidden.x\n")
    cat("and you can't call ")
    hidden.fn()
    cat("\n")
  }
})

which behaves as follows from the command prompt:

> ls()
[1] "example"
> example()
You can see and call example()
but you can't see hidden.x
and you can't call "hidden.fn()"
> hidden.x                 
Error: object 'hidden.x' not found
> hidden.fn()
Error: could not find function "hidden.fn"

I've seen this kind of thing discussed in Static Variables in R where a different approach was employed.

What the pros and cons of these two methods?

回答1:

Encapsulation

The advantage of this style of programming is that the hidden objects won't likely be overwritten by anything else so you can be more confident that they contain what you think. They won't be used by mistake since they can't readily be accessed. In the linked-to post in the question there is a global variable, count, which could be accessed and overwritten from anywhere so if we are debugging code and looking at count and see its changed we cannnot really be sure what part of the code has changed it. In contrast, in the example code of the question we have greater assurance that no other part of the code is involved.

Note that we actually can access the hidden function although its not that easy:

# run hidden.fn
environment(example)$hidden.fn()

Object Oriented Programming

Also note that this is very close to object oriented programming where example and hidden.fn are methods and hidden.x is a property. We could do it like this to make it explicit:

library(proto)
p <- proto(x = "x", 
  fn = function(.) cat(' "fn()"\n '),
  example = function(.) .$fn()
)
p$example() # prints "fn()"

proto does not hide x and fn but its not that easy to access them by mistake since you must use p$x and p$fn() to access them which is not really that different than being able to write e <- environment(example); e$hidden.fn()

EDIT:

The object oriented approach does add the possibility of inheritance, e.g. one could define a child of p which acts like p except that it overrides fn.

ch <- p$proto(fn = function(.) cat("Hello from ch\n")) # child
ch$example() # prints: Hello from ch


回答2:

local() can implement a singleton pattern -- e.g., the snow package uses this to track the single Rmpi instance that the user might create.

getMPIcluster <- NULL
setMPIcluster <- NULL
local({
    cl <- NULL
    getMPIcluster <<- function() cl
    setMPIcluster <<- function(new) cl <<- new
})

local() might also be used to manage memory in a script, e.g., allocating large intermediate objects required to create a final object on the last line of the clause. The large intermediate objects are available for garbage collection when local returns.

Using a function to create a closure is a factory pattern -- the bank account example in the Introduction To R documentation, where each time open.account is invoked, a new account is created.

As @otsaw mentions, memoization might be implemented using local, e.g., to cache web sites in a crawler

library(XML)
crawler <- local({
    seen <- new.env(parent=emptyenv())
    .do_crawl <- function(url, base, pattern) {
        if (!exists(url, seen)) {
            message(url)
            xml <- htmlTreeParse(url, useInternal=TRUE)
            hrefs <- unlist(getNodeSet(xml, "//a/@href"))
            urls <-
                sprintf("%s%s", base, grep(pattern, hrefs, value=TRUE))
            seen[[url]] <- length(urls)
            for (url in urls)
                .do_crawl(url, base, pattern)
        }
    }
    .do_report <- function(url) {
        urls <- as.list(seen)
        data.frame(Url=names(urls), Links=unlist(unname(urls)),
                   stringsAsFactors=FALSE)
    }
    list(crawl=function(base, pattern="^/.*html$") {
        .do_crawl(base, base, pattern)
    }, report=.do_report)
})

crawler$crawl(favorite_url)
dim(crawler$report())

(the usual example of memoization, Fibonacci numbers, is not satisfying -- the range of numbers that don't overflow R's numeric representation is small , so one would probably use a look-up table of efficiently pre-calculated values). Interesting how crawler here is a singleton; could as easily have followed a factory pattern, so one crawler per base URL.