Why does as.formula only work inside lm() inside w

Posted 2019-08-11 16:23

Working with R, this is a real WTF:

R> f_string <- 'Sepal.Length ~ Sepal.Width'
R> l <- with(iris, lm(as.formula(f_string))) # works fine

R> f_formula <- as.formula(f_string)
R> l <- with(iris, lm(f_formula))
Error in eval(expr, envir, enclos) : object 'Sepal.Length' not found

Why does as.formula have to be inside the lm() call? I get it that this is a question about which environment things are evaluated in, because this works:

R> f_formula <- with(iris, as.formula(f_string))
R> lm(f_formula)

but I'm having real trouble wrapping my head around why one works and the other one doesn't.

1 answer
Juvenile、少年°
Reply #2 · 2019-08-11 16:51

Your failing example fails because as.formula() is called at top level, so the formula's environment is the global environment:

> f_formula <- as.formula(f_string)
> l <- with(iris, lm(f_formula))
Error in eval(expr, envir, enclos) : object 'Sepal.Length' not found
> str(f_formula)
Class 'formula' length 3 Sepal.Length ~ Sepal.Width
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 

and there's no Sepal.Length there. If you create the appropriate objects in the global environment you can make it work:

> Sepal.Length=1:10
> Sepal.Width=runif(10)
> l <- with(iris, lm(f_formula)) # "works" (ie doesn't error)

But that is completely ignoring the iris data. Welcome to the world of annoying R behaviour.
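As a rough sketch of the usual way to sidestep this (assuming you are free to hand lm() the data frame directly instead of wrapping the call in with()): lm() looks for the variables in its data argument first and only falls back to the formula's environment for anything it doesn't find there. The name fit is just for illustration.

```r
f_string  <- 'Sepal.Length ~ Sepal.Width'

# The formula's environment is wherever this line runs
# (the global environment if typed at the prompt).
f_formula <- as.formula(f_string)

# Variables are looked up in `data` first, so the formula's
# environment no longer matters for Sepal.Length and Sepal.Width.
fit <- lm(f_formula, data = iris)

coef(fit)
```

With data = iris the fit uses the iris columns even if objects called Sepal.Length or Sepal.Width happen to exist in the global environment.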

The other examples all create the formula inside with(iris, ...), so its environment is the temporary environment that with() builds from the iris data frame. If you debug lm and take a look at what formula is in one of your working cases:

Browse[2]> str(formula)
Class 'formula' length 3 Sepal.Length ~ Sepal.Width
  ..- attr(*, ".Environment")=<environment: 0x9d590b4> 

you'll see the environment is no longer the global one. If you want to see what's in that environment, get it from the formula's attributes and list:

Browse[2]> e = attr(formula,".Environment")
Browse[2]> with(e,ls())
[1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"  "Species"     
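
You can see the same thing without the debugger. A small sketch, assuming the same f_string as in the question; f_outside, f_inside and cols are names made up here:

```r
f_string <- 'Sepal.Length ~ Sepal.Width'

# Created outside with(): the environment is wherever this line runs.
f_outside <- as.formula(f_string)

# Created inside with(): the environment is the temporary one that
# with() builds from the iris columns.
f_inside <- with(iris, as.formula(f_string))

environment(f_outside)            # R_GlobalEnv when run at the prompt
cols <- ls(environment(f_inside)) # names visible to the second formula
cols
```

environment() is the accessor for the ".Environment" attribute shown by str(), so this is the same information as in the Browse session above.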