Not able to figure out how to use column names in a function using the dplyr R package. A reproducible example is below:
Data
set.seed(12345)
Y <- rnorm(10)
Env <- paste0("E", rep(1:2, each = 5))
Gen <- paste0("G", rep(1:5, times = 2))
df1 <- data.frame(Y, Env, Gen)
Works outside function
library(dplyr)
df1 %>%
  dplyr::group_by(Env, Gen) %>%
  dplyr::summarize(mean(Y))
with(data = df1, expr = tapply(X = Y, INDEX = list(Env, Gen), FUN = mean))
First function
fn1 <- function(Y, E, G, data){
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))
  Out <- with(data = data, tapply(X = Y, INDEX = list(E, G), FUN = mean), parent.frame())
  return(Out)
}
fn1(Y = Y, E = Env, G = Gen, data = df1)
Error in tapply(X = Y, INDEX = list(E, G), FUN = mean) : arguments must have same length
Second function
fn2 <- function(Y, E, G, data){
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))
  library(dplyr)
  Out <- df1 %>%
    dplyr::group_by(E, G) %>%
    dplyr::summarize(mean(Y))
  return(Out)
}
fn2(Y = Y, E = Env, G = Gen, data = df1)
Error in grouped_df_impl(data, unname(vars), drop) : Column `E` is unknown
One option would be to use `enquo` to capture the expression and its environment in a quosure object, which can then be evaluated within `group_by`, `summarise`, `mutate`, etc. by using the `!!` operator or `UQ` (unquote expression).

In the OP's function, the expression is captured by `substitute`, but `deparse` then converts it to a string. By using `sym` from `rlang`, the string can be converted to a symbol and then evaluated with `!!` or `UQ` as above.

Another variant of the OP's function, without using `rlang`, would be to make use of `group_by_at` or `summarise_at`, which can take strings as arguments.
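The `enquo` option described above might look roughly like the following. This is a minimal sketch that rebuilds the question's `df1`. The function name `fn_quo` and the output column name `mean_Y` are illustrative choices, not part of the original post.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(12345)
df1 <- data.frame(
  Y   = rnorm(10),
  Env = paste0("E", rep(1:2, each = 5)),
  Gen = paste0("G", rep(1:5, times = 2))
)

# fn_quo is a hypothetical name; it takes bare (unquoted) column names
fn_quo <- function(Y, E, G, data) {
  # Capture each argument as a quosure (expression + environment)
  Y <- enquo(Y)
  E <- enquo(E)
  G <- enquo(G)

  data %>%
    group_by(!!E, !!G) %>%           # unquote the grouping columns
    summarise(mean_Y = mean(!!Y)) %>% # unquote the value column
    ungroup()
}

res_quo <- fn_quo(Y = Y, E = Env, G = Gen, data = df1)
print(res_quo)
```

Here the function uses its `data` argument rather than the global `df1`, so it works with any data frame that has the named columns.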
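For the `sym` approach, which keeps the OP's `deparse(substitute())` pattern, a sketch could look like this. It assumes the same `df1` as in the question. The name `fn_sym` and the column name `mean_Y` are made up for illustration.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(12345)
df1 <- data.frame(
  Y   = rnorm(10),
  Env = paste0("E", rep(1:2, each = 5)),
  Gen = paste0("G", rep(1:5, times = 2))
)

# fn_sym is a hypothetical name; it takes bare (unquoted) column names
fn_sym <- function(Y, E, G, data) {
  # deparse(substitute()) yields a string such as "Env";
  # rlang::sym() turns that string back into a symbol
  Y <- rlang::sym(deparse(substitute(Y)))
  E <- rlang::sym(deparse(substitute(E)))
  G <- rlang::sym(deparse(substitute(G)))

  data %>%
    group_by(!!E, !!G) %>%
    summarise(mean_Y = mean(!!Y)) %>%
    ungroup()
}

res_sym <- fn_sym(Y = Y, E = Env, G = Gen, data = df1)
print(res_sym)
```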
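The string-based variant with `group_by_at` and `summarise_at` can be sketched as below, again assuming the question's `df1` and using a hypothetical function name `fn_at`. Note that in dplyr 1.0.0 and later these scoped verbs are superseded by `across()`, although they still work.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(12345)
df1 <- data.frame(
  Y   = rnorm(10),
  Env = paste0("E", rep(1:2, each = 5)),
  Gen = paste0("G", rep(1:5, times = 2))
)

# fn_at is a hypothetical name; it takes bare (unquoted) column names
fn_at <- function(Y, E, G, data) {
  # Keep the column names as plain strings
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))

  data %>%
    group_by_at(c(E, G)) %>%   # accepts a character vector of column names
    summarise_at(Y, mean) %>%  # the result column keeps the name held in Y
    ungroup()
}

res_at <- fn_at(Y = Y, E = Env, G = Gen, data = df1)
print(res_at)
```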