Show special primitive functions in call stack

Posted 2019-02-13 19:59

Question:

Another question prompted this one: is there a way to view the special primitive functions that are in the call stack?

For example, create a function that returns the call stack on exit:

myFun <- function(obj){
  on.exit(print(sys.calls()))
  return(obj)
}

Calling this function and assigning its result to an object using assign avoids using special primitive functions:

> assign("myObj",myFun(4))
[[1]]
assign("myObj", myFun(4))

[[2]]
myFun(4)

But when the assignment operator is used instead, the assignment call is left out of the stack:

> `<-`(myObj, myFun(6))
[[1]]
myFun(6)

Granted, it is probably not all that common to want to see the assignment operator in the call stack, but other functions such as rep and log are hidden in the same way.
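To make the rep and log case concrete, here is a minimal sketch along the lines of myFun above. The names seen and record_stack are made up for this illustration; the function stores the deparsed call stack in a side environment so that it can be inspected after each call. identity() stands in for an ordinary closure.

```r
# Hypothetical side channel that holds the most recently recorded stack
seen <- new.env()

# Like myFun, but stores the stack as text instead of printing it on exit
record_stack <- function(obj) {
  calls <- sys.calls()
  seen$calls <- vapply(calls,
                       function(cl) paste(deparse(cl), collapse = " "),
                       character(1))
  obj
}

invisible(identity(record_stack(4)))  # identity() is an ordinary closure
closure_stack <- seen$calls

invisible(log(record_stack(4)))       # log() is a primitive
log_stack <- seen$calls

invisible(rep(record_stack(4), 2))    # rep() is a primitive
rep_stack <- seen$calls

print(closure_stack)
print(log_stack)
print(rep_stack)
```

When this is run at top level, the first stack should contain the identity(...) call, while the other two should contain record_stack(4) without any enclosing log(...) or rep(...) entry.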

Answer 1:

I don't think there's any way to access calls to primitive functions via the call stack. Here is why.

When a "typical" R function is evaluated:

  1. The supplied arguments are matched to the formal arguments.
  2. A new environment (with a pointer to its enclosing environment) is created, and the formal arguments are assigned into it.
  3. The body of the function is evaluated in the newly created environment.
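The three steps above can be observed from inside an ordinary function. This is only a sketch; demo and its arguments are hypothetical names chosen for the illustration. match.call() shows the result of argument matching, environment() returns the freshly created frame, and parent.env() gives its enclosing environment.

```r
demo <- function(first, second = 2) {
  list(
    matched   = match.call(),               # step 1: arguments matched to formals
    frame     = environment(),              # step 2: the new environment for this call
    enclosure = parent.env(environment()),  # step 2: its enclosing environment
    locals    = sort(ls())                  # step 3: the body sees the formals as locals
  )
}

# `sec` partially matches `second`; the unnamed 1 is matched to `first`
res <- demo(sec = 10, 1)

print(res$matched)
print(res$locals)
print(identical(res$enclosure, environment(demo)))
```

Each call gets its own frame, so the frame values returned by two separate calls to demo() are different environments, while the enclosure is always the environment in which demo was defined.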

The chain of enclosing environments that is built up when function calls are nested within one another is the "call stack" or "frame stack" to which sys.calls(), sys.frames() and the like provide some access.

My strong suspicion is that calls to primitive functions don't appear on the call stack because no R-side environment is created during their evaluation. No environment is created, so no environment appears on the call stack.

For some more insight, here's how John Chambers describes the evaluation of primitive functions on page 464 of Software for Data Analysis:

Evaluation of a call to one of these functions starts off in the usual way, but when the evaluator discovers that the function object is a primitive rather than a function defined in R, it branches to an entirely different computation. The object only appears to be a function object with formal arguments and a call to the function .Primitive() with a string argument. In reality, it essentially contains only an index into a table that is part of the C code implementing the core of R. The entry of the table identifies a C routine in the core that is responsible for evaluating calls to this specific primitive. The evaluator will transfer control to that routine, and expects the routine to return a C-language pointer to the R object representing the value of the call.
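A quick way to see the difference Chambers describes from the R side is to ask a function for its R-level parts. The sketch below uses only base functions; the list names are arbitrary. For primitives, formals(), body() and environment() all return NULL, whereas an ordinary closure such as assign carries an environment.

```r
prims    <- list(assign_op = `<-`, log = log, rep = rep)
closures <- list(assign = assign, identity = identity)

# Which of these are primitives?
print(vapply(prims, is.primitive, logical(1)))
print(vapply(closures, is.primitive, logical(1)))

# Primitives have no R-level formals, body or environment
print(is.null(formals(log)))
print(is.null(body(rep)))
print(is.null(environment(`<-`)))

# A closure does have an environment
print(is.environment(environment(assign)))
```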



Answer 2:

I don’t think Josh’s answer is correct.

Well, it would be correct if <- were on the call stack in your example. But it isn’t.

Small recap: normal R function evaluation treats arguments as promises that are evaluated lazily when accessed. This means that in the following call:

foo(bar(baz))

bar(baz) is evaluated inside foo (if at all). Consequently, if we inspect the call stack inside bar, like so:

bar = function (x) {
    sys.calls()
}

… then it looks as follows:

[[1]]
foo(bar(baz))

[[2]]
bar(baz)
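For completeness, here is a runnable version of that example. The definitions of foo, baz and ignore are assumptions made for the sketch, since the text above only defines bar: foo simply returns its argument (which forces the promise), and ignore never touches its argument.

```r
bar <- function(x) sys.calls()
foo <- function(arg) arg   # forces `arg`, so bar(baz) runs while foo is active
baz <- 1

stack <- as.list(foo(bar(baz)))
print(tail(stack, 2))      # the two innermost calls

# If the argument is never accessed, the promise is never evaluated
ignore <- function(arg) "arg was not used"
ran <- FALSE
invisible(ignore({ ran <- TRUE; bar(baz) }))
print(ran)
```

The second half illustrates the "(if at all)" part: because ignore never accesses arg, the expression passed to it is not evaluated and ran keeps its initial value.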

Alas, as you noted, <- (and =) isn’t a normal function, it’s a primitive (BUILTINSXP). In fact, it’s defined in the R source as follows:

{"<-",      do_set,     1,  100,    -1, {PP_ASSIGN,  PREC_LEFT,   1}},

Take a look at the fourth argument: 100. The comment before this code explains what the digits mean. Here’s the relevant part, explaining the leftmost digit:

Z=1 says evaluate arguments before calling (BUILTINSXP)

This means that in the following code, the call to bar(baz) is evaluated before the assignment:

`<-`(x, bar(baz))

That’s why <- doesn’t appear in the list of sys.calls(): it isn’t a current call. It gets called after bar finishes evaluating.


There’s a way to work around this limitation: you can redefine <-/= in R code. If you do this, it behaves like a normal R function:

`<-` = function (lhs, rhs) {
    name = as.name(deparse(substitute(lhs), backtick = TRUE))
    rhs # evaluate expression before passing it to `bquote`, for a cleaner call stack
    eval.parent(bquote(base::`<-`(.(name), .(rhs))))
}

However, beware that this will incur a non-negligible performance hit for every subsequent assignment within the scope where <- is redefined. In fact, it makes assignment roughly a factor of 1000 (!!!) slower, which is not usually acceptable.
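To see the workaround in action without affecting the rest of the session, the redefinition can be confined to a local() block. This is a sketch: calls_as_text, seen and stack_text are hypothetical names. Inside the block, = is used for the helper definitions so that only the one assignment of interest goes through the redefined <-.

```r
stack_text <- local({
  # Redefinition from above, visible only inside this block
  `<-` = function(lhs, rhs) {
    name = as.name(deparse(substitute(lhs), backtick = TRUE))
    rhs
    eval.parent(bquote(base::`<-`(.(name), .(rhs))))
  }

  # Hypothetical helper: the current call stack as character strings
  calls_as_text = function() {
    calls = sys.calls()
    vapply(calls,
           function(cl) paste(deparse(cl), collapse = " "),
           character(1))
  }

  seen <- calls_as_text()  # this assignment goes through the closure above
  seen
})

print(tail(stack_text, 2))
```

Here the assignment is itself recorded as a call, directly before calls_as_text(). Note that the helper returns a plain character vector on purpose, so that the value spliced in by bquote is not a language object that would be evaluated again.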



Tags: r callstack