Using variables for column functions in mutate()

Posted 2020-04-08 12:51

Question:

How can I use string variables in place of column names in dplyr? As an example, say I want to add a column called sum to the iris dataset that is the sum of Sepal.Length and Sepal.Width. In short, I want a working version of the code below.

x = "Sepal.Length"
y = "Sepal.Width"
head(iris %>% mutate(sum = x + y))

Currently, running the code outputs "Evaluation error: non-numeric argument to binary operator" as R evaluates x and y as character vectors. How do I instead get R to evaluate x and y as column names of the dataframe? I know that the answer is to use some form of lazy evaluation, but I'm having trouble figuring out exactly how to configure it.

Note that the proposed duplicate, "dplyr - mutate: use dynamic variable names", does not address this issue. The duplicate answers a different question:

Not my question: How do I do:

var = "sum"
head(iris %>% mutate(var = Sepal.Length + Sepal.Width))

Answer 1:

I think the recommended way is to use sym(), which turns each string into a symbol, and unquote the result with !!:

iris %>% mutate(sum = !!sym(x) + !!sym(y)) %>% head
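An alternative that avoids unquoting is the .data pronoun, which looks a column up by its name as a string. The following is only a minimal sketch, assuming a dplyr version recent enough to export .data; the name result is introduced here purely for illustration.

```r
library(dplyr)

# Column names held as strings, as in the question
x <- "Sepal.Length"
y <- "Sepal.Width"

# .data[[x]] fetches the column whose name is stored in x;
# it only ever looks inside the data frame, never in the calling environment
result <- iris %>%
  mutate(sum = .data[[x]] + .data[[y]]) %>%
  head()

print(result)
```

Both forms should give the same sum column; which one reads better is largely a matter of taste.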


Answer 2:

It also works with get():

> rm(list = ls())
> data("iris")
> 
> library(dplyr)
> 
> x <- "Sepal.Length"
> y <- "Sepal.Width"
> 
> head(iris %>% mutate(sum = get(x) + get(y)))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species sum
1          5.1         3.5          1.4         0.2  setosa 8.6
2          4.9         3.0          1.4         0.2  setosa 7.9
3          4.7         3.2          1.3         0.2  setosa 7.9
4          4.6         3.1          1.5         0.2  setosa 7.7
5          5.0         3.6          1.4         0.2  setosa 8.6
6          5.4         3.9          1.7         0.4  setosa 9.3
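One caveat worth keeping in mind with get(): as far as I understand it, get() searches the data frame's columns first and then falls through to the surrounding environments, so a misspelled or missing column name can silently pick up an unrelated variable. The sketch below illustrates this under that assumption; bonus, with_get and with_data are hypothetical names made up for the demonstration.

```r
library(dplyr)

x <- "Sepal.Length"
y <- "bonus"   # NOT a column of iris

# An ordinary variable that happens to share the name stored in y
bonus <- 100

# get(y) finds no column called "bonus", so it falls back to the variable above
with_get <- iris %>% mutate(sum = get(x) + get(y)) %>% head()
print(with_get$sum)

# .data[[y]] only looks at columns, so the same mistake raises an error instead
with_data <- try(
  iris %>% mutate(sum = .data[[x]] + .data[[y]]) %>% head(),
  silent = TRUE
)
print(inherits(with_data, "try-error"))
```

If the column names come from user input or a config file, failing loudly on a missing column is probably the safer behaviour.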


Tags: r dplyr