Tidy evaluation programming with dplyr::case_when

Posted 2019-04-19 16:29

Question:

I am trying to write a simple function that wraps dplyr::case_when(). I read the "Programming with dplyr" vignette at https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html but can't figure out how it applies to case_when().

I have the following data:

library(dplyr)  # provides tibble(), mutate(), case_when() and %>%

data <- tibble(
   item_name = c("apple", "bmw", "bmw")
)

And the following list:

cat <- list(
   item_name == "apple" ~ "fruit",
   item_name == "bmw" ~ "car"
)

Then I would like to write a function like:

category_fn <- function(df, ...){
   cat1 <- quos(...)
   df %>%
     mutate(category = case_when((!!!cat1)))
}

Unfortunately, category_fn(data, cat) gives an evaluation error in this case. I would like to obtain the same output as:

data %>% 
   mutate(category = case_when(item_name == "apple" ~ "fruit",
                               item_name == "bmw" ~ "car"))

What is the way to do this?

Answer 1:

Quote each element of your list first:

cat <- list(
  quo(item_name == "apple" ~ "fruit"),
  quo(item_name == "bmw" ~ "car")
)

The function then no longer has to quote the cat object itself. I have also replaced the catch-all ... argument with an explicit categories argument, which is spliced into the case_when() call:

category_fn <- function(df, categories){
  df %>%
    mutate(category = case_when(!!!categories))
}

The output of the function is then as expected:

category_fn(data, cat)
# A tibble: 3 x 2
  item_name category
      <chr>    <chr>
1     apple    fruit
2       bmw      car
3       bmw      car
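
To see why the quoting step matters, it can help to inspect what each list element actually is. The sketch below is illustrative only; the names bare and quoted are hypothetical and not from the question. A bare formula is built as soon as list() runs and carries the environment it was written in, whereas quo() stores the unevaluated expression so that it can be evaluated later, inside mutate(), where the data frame columns are visible.

```r
library(rlang)

# Bare formulas: `~` runs as soon as list() is built and yields formula objects.
bare <- list(
  item_name == "apple" ~ "fruit",
  item_name == "bmw" ~ "car"
)

# Quoted versions: quo() captures each expression without evaluating it.
quoted <- list(
  quo(item_name == "apple" ~ "fruit"),
  quo(item_name == "bmw" ~ "car")
)

class(bare[[1]])            # a formula object, already constructed
is_quosure(bare[[1]])       # FALSE
is_quosure(quoted[[1]])     # TRUE
quo_get_expr(quoted[[1]])   # the captured, still unevaluated call
```

Splicing the quoted elements with !!! places the unevaluated calls inside the case_when() call, so the formulas are only created once mutate() evaluates that call.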

For completeness, the list also works with this function when each element is quoted with the base R quote() function instead:

cat <- list(
  quote(item_name == "apple" ~ "fruit"),
  quote(item_name == "bmw" ~ "car")
)
> cat
[[1]]
item_name == "apple" ~ "fruit"

[[2]]
item_name == "bmw" ~ "car"

> category_fn(data, cat)
# A tibble: 3 x 2
  item_name category
      <chr>    <chr>
1     apple    fruit
2       bmw      car
3       bmw      car
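
As a possible shortcut, rlang::exprs() quotes all of its arguments at once and returns them as a list, so the individual quote() calls can be dropped. This is a minimal sketch that reuses data and category_fn from above; the variable result is introduced here only for illustration.

```r
library(dplyr)
library(rlang)

data <- tibble(item_name = c("apple", "bmw", "bmw"))

# exprs() captures every argument unevaluated and returns them as a list,
# so each element does not need its own quote() call.
cat <- exprs(
  item_name == "apple" ~ "fruit",
  item_name == "bmw" ~ "car"
)

category_fn <- function(df, categories) {
  df %>%
    mutate(category = case_when(!!!categories))
}

result <- category_fn(data, cat)
result
```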


Answer 2:

1) Pass list. Using let from the wrapr package, with data and cat from the question, this works without modifying the inputs in any way.

library(dplyr)
library(wrapr)

category_fn <- function(data, List) {
  let(c(CATEGORY = toString(sapply(List, format))),
      data %>% mutate(category = case_when(CATEGORY)),
      subsMethod = "stringsubs",
      strict = FALSE)
}
category_fn(data, cat) # test

giving:

# A tibble: 3 x 2
  item_name category
      <chr>    <chr>
1     apple    fruit
2       bmw      car
3       bmw      car

1a) Using tidyeval/rlang and data and cat from the question:

category_fn <- function(data, List) {
  cat_ <- lapply(List, function(x) do.call("substitute", list(x)))
  data %>% mutate(category = case_when(!!!cat_))
}
category_fn(data, cat)

giving the same result as above.

2) Pass list components separately. If your intention was to pass each component of cat separately instead of cat itself, then this works:

category_fn <- function(data, ...) eval.parent(substitute({
   data %>% mutate(category = case_when(...))
}))

category_fn(data, item_name == "apple" ~ "fruit",
                   item_name == "bmw" ~ "car") # test

giving:

# A tibble: 3 x 2
  item_name category
      <chr>    <chr>
1     apple    fruit
2       bmw      car
3       bmw      car

2a) If you prefer tidyeval/rlang, then this case is straightforward:

library(dplyr)
library(rlang)

category_fn <- function(data, ...) {
   cat_ <- quos(...)
   data %>% mutate(category = case_when(!!!cat_))
}

category_fn(data, item_name == "apple" ~ "fruit",
                   item_name == "bmw" ~ "car") # test
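
If the rules are already stored in a list, the same ...-based function can probably still be used by splicing the list at the call site with !!!. The sketch below assumes the list holds unevaluated expressions (built here with rlang::exprs()) rather than the bare formulas from the question; the names rules and result are hypothetical.

```r
library(dplyr)
library(rlang)

data <- tibble(item_name = c("apple", "bmw", "bmw"))

category_fn <- function(data, ...) {
  cat_ <- quos(...)
  data %>% mutate(category = case_when(!!!cat_))
}

# Hypothetical list of unevaluated rules, built with exprs().
rules <- exprs(
  item_name == "apple" ~ "fruit",
  item_name == "bmw" ~ "car"
)

# `!!!` splices the list elements into `...` as separate arguments.
result <- category_fn(data, !!!rules)
result
```

This keeps a single function for both calling styles: rules typed inline, or rules collected in a list beforehand.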