I'm looking for a more efficient and elegant way to pass multiple arguments to group_by using non-standard evaluation inside a function that uses dplyr. I don't want to use the ... operator; I'd rather specify the arguments individually.
My specific use case is a function which takes a data frame and creates a ggplot object with simpler syntax. Here is an example of the code I want to automate with my function:
library(dplyr)
library(ggplot2)

# create data frame
my_df <- data.frame(month = sample(1:12, 1000, replace = TRUE),
                    category = sample(head(letters, 3), 1000, replace = TRUE),
                    approved = as.numeric(runif(1000) < 0.5))
my_df$converted <- my_df$approved * as.numeric(runif(1000) < 0.5)

# group, summarize, and plot one line per category
my_df %>%
  group_by(month, category) %>%
  summarize(conversion_rate = sum(converted) / sum(approved)) %>%
  ggplot() +
  geom_line(aes(x = month, y = conversion_rate, group = category,
                color = category))
I want to combine that group_by, summarize, ggplot, and geom_line into a simple function that I can feed an x, y, and group, and have it perform all the dirty work under the hood. Here's what I've gotten to work:
# create the function that does the grouping and plotting
plot_lines <- function(df, x, y, group) {
  x <- enquo(x)
  group <- enquo(group)
  group_bys <- quos(!! x, !! group)
  df %>%
    group_by(!!! group_bys) %>%
    my_smry %>%
    ggplot + geom_line(aes_(x = substitute(x), y = substitute(y),
                            group = substitute(group), color = substitute(group)))
}

# create a function to do the summarization
my_smry <- function(x) {
  x %>%
    summarize(conversion_rate = sum(converted) / sum(approved))
}

# use my function
my_df %>%
  plot_lines(x = month, y = conversion_rate, group = category)
I feel like the group_by handling is pretty inelegant: quoting x and group with enquo, then unquoting them with !! inside of another quoting function quos, only to re-unquote them with !!! on the next line, but it's the only thing I've been able to get to work. Is there a better way to do this?
Also, is there a way to get ggplot to take !! instead of substitute? What I'm doing feels inconsistent.
The problem is that ggplot hasn't been updated to handle quosures yet, so you've got to pass it expressions, which you can create from quosures with rlang::quo_expr. However, keep in mind that ggplot will almost inevitably be updated from lazyeval to rlang, so while this interface will probably keep working, a simpler, more consistent one will probably be possible shortly.
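The answer's original code did not survive, so here is a minimal sketch of what the approach might look like, not the answerer's exact code. It assumes a current rlang, where (as far as I know) quo_expr has been renamed to quo_squash, so the sketch calls rlang::quo_squash instead. It also unquotes x and group directly inside group_by, which removes the intermediate quos() step from the question. The data frame and seed are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical example data, mirroring the question's my_df
set.seed(1)
my_df <- data.frame(month = sample(1:12, 1000, replace = TRUE),
                    category = sample(head(letters, 3), 1000, replace = TRUE),
                    approved = as.numeric(runif(1000) < 0.5))
my_df$converted <- my_df$approved * as.numeric(runif(1000) < 0.5)

plot_lines <- function(df, x, y, group) {
  # Capture each argument as a quosure
  x <- enquo(x)
  y <- enquo(y)
  group <- enquo(group)

  df %>%
    # Unquote the quosures directly; no intermediate quos() needed
    group_by(!!x, !!group) %>%
    summarize(conversion_rate = sum(converted) / sum(approved)) %>%
    ggplot() +
    # aes_() takes plain expressions, so strip the quosures down to symbols.
    # Assumption: quo_squash() stands in for the older quo_expr().
    geom_line(aes_(x = rlang::quo_squash(x),
                   y = rlang::quo_squash(y),
                   group = rlang::quo_squash(group),
                   color = rlang::quo_squash(group)))
}

p <- plot_lines(my_df, x = month, y = conversion_rate, group = category)
```

On ggplot2 3.0.0 and later, aes() itself supports !!, so the aes_() call could probably be replaced with aes(x = !!x, y = !!y, group = !!group, color = !!group), which would make the whole function consistent.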
You could just do a straight eval.parent(substitute(...)). Being base R, it works consistently across R and is simple to do. One can even use an ordinary aes.
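The code for this answer is also missing, so the following is only a sketch of how the eval.parent(substitute(...)) idea could be applied to the question's function, under the assumption that the whole pipeline is wrapped in substitute. substitute replaces df, x, y, and group with the expressions the caller supplied, and eval.parent then evaluates the resulting call in the caller's frame. The example data and seed are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical example data, mirroring the question's my_df
set.seed(1)
my_df <- data.frame(month = sample(1:12, 1000, replace = TRUE),
                    category = sample(head(letters, 3), 1000, replace = TRUE),
                    approved = as.numeric(runif(1000) < 0.5))
my_df$converted <- my_df$approved * as.numeric(runif(1000) < 0.5)

plot_lines <- function(df, x, y, group) {
  # substitute() swaps df, x, y, and group for the caller's expressions;
  # eval.parent() then runs the rewritten code in the caller's frame.
  eval.parent(substitute(
    group_by(df, x, group) %>%
      summarize(conversion_rate = sum(converted) / sum(approved)) %>%
      ggplot() +
      geom_line(aes(x = x, y = y, group = group, color = group))
  ))
}

p <- plot_lines(my_df, x = month, y = conversion_rate, group = category)
```

The sketch calls the function directly. Piping a data frame into it would substitute the pipe's placeholder for df, so whether that form works depends on how the pipe binds its placeholder, which is not covered here.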