I'd like to be able to use dplyr
's case_when
in a programmatic way to replace base R cut()
function.
Currently, case_when can be used with an external argument through NSE like :
library(dplyr)
library(rlang)
patterns <- list(
x <= 2 ~ "<=2",
x <= 4 ~ "2<->4",
x > 4 ~ ">4"
)
x <- 1:10
case_when(!!!patterns)
What I want to do is : use it with another variable, inside a mutate
The idea would be something like this, although I can't figure out how to get it to work :
library(dplyr)
patterns_lazy <- list(
!!quo(x) <= 2 ~ "<=2",
!!quo(x) <= 4 ~ "2<->4",
!!quo(x) > 4 ~ ">4"
)
x <- "cyl"
mtcars %>% mutate(ABC = case_when(!!!patterns_lazy))
I'd like to be able to define the column (inside a string) that I want to filter on, and retrieve something like this (this example is not working as it's the desired syntax):
x <- "cyl"
mtcars %>%
select(cyl) %>%
mutate(ABC = case_when(!!!patterns_lazy)) %>%
head()
cyl ABC
1 6 >4
2 6 >4
3 4 2<->4
4 6 >4
5 8 >4
6 6 >4
Thanks for any help :)
You cannot use !!
there:
patterns <- list(
!!quo(x) <= 2 ~ "<=2",
!!quo(x) <= 4 ~ "2<->4",
!!quo(x) > 4 ~ ">4"
)
- Neither
list()
nor ~
support quasiquotation.
- If it did support quasiquotation, you'd need to be careful with operator precedence and wrap your
!!quo()
in parentheses.
- And finally, that quote to
x
would evaluate to a string and you'd be comparing numbers with strings (in your example it would be "cyl
) which R will do happily thanks to implicit coercions :/
So you need to use exprs()
instead of list()
, and use x
with the
.data
pronoun instead of a quote of x
.
exprs()
will create a list of unevaluated expressions. Unevaluated
is good: if your formula was evaluated, it would carry an environment
(here the global env) and that environment doesn't contain any of the
data supplied to dplyr, and in particular doesn't have the .data
pronoun. On the other hand if the formulas are "context-less", they
get evaluated in the data context which is what we want.
patterns_lazy <- exprs(
.data[[x]] <= 2 ~ "<=2",
.data[[x]] <= 4 ~ "2<->4",
.data[[x]] > 4 ~ ">4"
)
x <- "cyl"
pull(mutate(mtcars, case_when(!!!patterns_lazy)))
#> [1] ">4" ">4" "2<->4" ">4" ">4" ">4" ">4" "2<->4" "2<->4"
#> [10] ">4" ">4" ">4" ">4" ">4" ">4" ">4" ">4" "2<->4"
#> [19] "2<->4" "2<->4" "2<->4" ">4" ">4" ">4" ">4" "2<->4" "2<->4"
#> [28] "2<->4" ">4" ">4" ">4" "2<->4"
Here is one option with ifelse
f1 <- function(data, x){
x <- enquo(x)
f2 <- function(y) ifelse(y <= 2, "<=2", ifelse(y <=4, "2<->4", ">4"))
data %>%
mutate( ABC = f2(UQ(x)))
}
f1(mtcars, cyl) %>%
head()
# mpg cyl disp hp drat wt qsec vs am gear carb ABC
#1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 >4
#2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 >4
#3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 2<->4
#4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 >4
#5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 >4
#6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 >4