R functions calling lm with subsets

Posted 2019-02-25 10:42

Question:

I was working on some code and noticed something peculiar. When I run lm on a subset of some panel data, it works fine:

library('plm')
data(Cigar)
lm(log(price) ~ log(pop) + log(ndi), data=Cigar, subset=Cigar$state==1)

Call:
lm(formula = log(price) ~ log(pop) + log(ndi), data = Cigar, 
subset = Cigar$state == 1)


Coefficients:
(Intercept)     log(pop)     log(ndi)  
  -26.4919       3.2749       0.4265  

But when I try to wrap this in a function, I get:

myfunction <- function(formula, data, subset){
  return(lm(formula, data, subset))
}

myfunction(formula = log(price) ~ log(pop) + log(ndi), data = Cigar, subset = Cigar$state==1)

Error in xj[i] : invalid subscript type 'closure'

I really don't understand what's going on here, but it's breaking some other code I've written so I would like to know.

Answer 1:

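The problem isn't the subset expression itself: I get the same error from your function when I change it to subset = (state == 1). The arguments to your function aren't reaching lm in a form it can evaluate. lm captures its call unevaluated, so inside the wrapper it sees the bare name subset rather than the expression you passed.

I think you'd be better off using do.call: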

myfunction <- function(formula, data, subset) {
    do.call("lm", as.list(match.call()[-1]))
}

myfunction(log(price) ~ log(pop) + log(ndi), Cigar, state == 1)    
# Call:
# lm(formula = log(price) ~ log(pop) + log(ndi), data = Cigar, 
#     subset = state == 1)
#
# Coefficients:
# (Intercept)     log(pop)     log(ndi)  
#    -26.4919       3.2749       0.4265  
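
As a sketch of an alternative to do.call, the wrapper can rewrite its own call so that it points at lm and then evaluate that call where the wrapper was invoked. The name lm_wrapper is hypothetical, and the built-in mtcars data stands in for Cigar so the example runs without plm; treat it as one possible approach rather than the recommended one.

```r
# Hypothetical wrapper (not from the original thread): forwards its own
# call to lm() without evaluating the arguments first.
lm_wrapper <- function(formula, data, subset) {
  cl <- match.call()            # the call exactly as the user wrote it
  cl[[1L]] <- quote(stats::lm)  # replace lm_wrapper with lm in that call
  eval(cl, parent.frame())      # evaluate it in the caller's environment
}

# mtcars is an assumption used only to keep the example self-contained.
fit_wrapped <- lm_wrapper(mpg ~ wt, mtcars, cyl == 4)
fit_direct  <- lm(mpg ~ wt, data = mtcars, subset = cyl == 4)

# Both fits should use the same rows and give the same coefficients.
print(all.equal(coef(fit_wrapped), coef(fit_direct)))
```

Because the call is evaluated in parent.frame(), names such as the data frame are looked up where the wrapper was called, which may matter if the wrapper is itself called from inside another function.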


Answer 2:

You are most likely running into problems with non-standard evaluation, which lm uses for arguments such as subset. Functions that use non-standard evaluation are convenient at the command line, but can cause problems when called from within other functions.
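
The following sketch illustrates where the word 'closure' in the error message comes from. It assumes the built-in mtcars data in place of Cigar, and naive_wrapper is a hypothetical name mirroring the function in the question.

```r
# Hypothetical wrapper mirroring the one in the question; mtcars is used
# only so the sketch runs without the plm package.
naive_wrapper <- function(formula, data, subset) {
  lm(formula, data, subset)
}

# lm() records its call as lm(formula = formula, data = data, subset = subset)
# and hands it to model.frame(), which evaluates the *name* `subset` in the
# data and then in the formula's environment. No variable called `subset`
# exists in either place, so the lookup ends at the function base::subset.
res <- tryCatch(
  naive_wrapper(mpg ~ wt, mtcars, mtcars$cyl == 4),
  error = function(e) conditionMessage(e)
)
print(res)

# The object found under the name `subset` is a function, i.e. a closure.
print(typeof(subset))
```

In other words, the logical vector passed to the wrapper is never used; R ends up trying to index the data with a function, which is what the error reports.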

Additional reading on the topic includes Standard Nonstandard Evaluation Rules and the chapter on non-standard evaluation in Advanced R.