Using dplyr group_by in a function

Posted 2019-05-12 20:45

Question:

I am trying to use dplyr's group_by inside a function of my own, for example:

testFunction <- function(df, x) {
  df %>%
    group_by(x) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

testFunction(iris, Species)

and I get the error "... unknown variable to group by: x". I've tried group_by_, but it gives me a summary of the entire dataset. Does anybody have a clue how I can fix this?

Thanks in advance!

Answer 1:

Here is one way to do it with the new enquo from dplyr. enquo captures the expression passed as x (here the bare column name Species, not a string) and wraps it in a quosure, which is then unquoted with !! (or UQ) inside group_by, mutate, summarise, etc.

library(dplyr)
testFunction <- function(df, x) {
  x <- enquo(x)
  df %>%
    group_by(!! x) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

testFunction(iris, Species)
# A tibble: 3 x 2
#     Species mean.Petal.Width
#      <fctr>            <dbl>
#1     setosa            0.246
#2 versicolor            1.326
#3  virginica            2.026
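
As a side note, the enquo/!! pair above can usually be collapsed into a single step with the embrace operator {{ }}. The sketch below assumes rlang 0.4.0 or later (which ships with recent dplyr releases); the function name test_function_embrace is made up for illustration and is not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of testFunction: {{ x }} captures the bare column
# name passed as x and unquotes it in one step (assumes rlang >= 0.4.0).
test_function_embrace <- function(df, x) {
  df %>%
    group_by({{ x }}) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

result_embrace <- test_function_embrace(iris, Species)
print(result_embrace)
```

The call site stays the same as in the answer: the column is passed unquoted, as in test_function_embrace(iris, Species).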


Answer 2:

I got it to work like this:

testFunction <- function(df, x) {
  df %>%
    group_by(get(x)) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

testFunction(iris, "Species")

I changed x to get(x) inside the function, and passed the column name as a string, "Species" instead of Species, in the call testFunction(iris, ...).
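
One thing to watch with get(x): as far as I can tell, the grouping column in the result ends up labelled get(x) rather than Species. If the column name has to arrive as a string, a possible alternative is across(all_of(x)), sketched below. This assumes dplyr 1.0.0 or later; the function name test_function_string is hypothetical and only used for this example.

```r
library(dplyr)

# Hypothetical string-based variant: x is a character vector of column names.
# across(all_of(x)) selects those columns for grouping and keeps their
# original names in the output (assumes dplyr >= 1.0.0).
test_function_string <- function(df, x) {
  df %>%
    group_by(across(all_of(x))) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

result_string <- test_function_string(iris, "Species")
print(result_string)
```

Because all_of() takes a character vector, the same function should also accept several grouping columns at once, e.g. c("Species", "some_other_column") on a data frame that has both.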