Looking for a more efficient / elegant way to pass multiple arguments to a group-by using non-standard evaluation in a function using dplyr. I don't want to use the ... operator, but to specify the functions individually.
My specific use case is a function which takes a data frame and creates a ggplot object with simpler syntax. Here is an example of the code I want to automate with my function:
# create data frame
my_df <- data.frame(month = sample(1:12, 1000, replace = T),
category = sample(head(letters, 3), 1000, replace = T),
approved = as.numeric(runif(1000) < 0.5))
my_df$converted <- my_df$approved * as.numeric(runif(1000) < 0.5)
my_df %>%
group_by(month, category) %>%
summarize(conversion_rate = sum(converted) / sum(approved)) %>%
ggplot + geom_line(aes(x = month, y = conversion_rate, group = category,
color = category))
I want to combine that group_by, summarize, ggplot, and geom_line into a simple function that I can feed an x, y, and group, and have it perform all the dirty work under the hood. Here's what I've gotten to work:
# create the function that does the grouping and plotting
plot_lines <- function(df, x, y, group) {
x <- enquo(x)
group <- enquo(group)
group_bys <- quos(!! x, !! group)
df %>%
group_by(!!! group_bys) %>%
my_smry %>%
ggplot + geom_line(aes_(x = substitute(x), y = substitute(y),
group = substitute(group), color = substitute(group)))
}
# create a function to do the summarization
my_smry <- function(x) {
x %>%
summarize(conversion_rate = sum(converted) / sum(approved))
}
# use my function
my_df %>%
plot_lines(x = month, y = conversion_rate, group = category)
I feel like the group_by handling is pretty inelegant: quoting x
and group
with enquo
, then unquoting them with !!
inside of another quoting function quos
, only to re-unquote them with !!!
on the next line, but it's the only thing I've been able to get to work. Is there a better way to do this?
Also, is there a way to get ggplot to take !!
instead of substitute
? What I'm doing feels inconsistent.