I have a custom function that reads user-supplied variables from a data frame using rlang. The function works fine whether the arguments are quoted or unquoted. Strangely, though, when it is used with purrr::pmap, it works only if the arguments are quoted.
So I have two questions:
Why does the function behave this way?
How can I write a function using rlang so that I don't have to quote the arguments, even when it is used with purrr::pmap?
Here is a minimal reprex that uses a simple function to highlight this issue:
# loading the needed libraries
library(rlang)
library(dplyr)
library(purrr)
# defining the function
tryfn <- function(data, x, y) {
  data <-
    dplyr::select(
      .data = data,
      x = !!rlang::enquo(x),
      y = !!rlang::enquo(y)
    )
  # creating a dataframe of means
  result_df <- data.frame(mean.x = mean(data$x), mean.y = mean(data$y))
  # return the dataframe
  return(result_df)
}
# without quotes (works!)
tryfn(iris, Sepal.Length, Sepal.Width)
#> mean.x mean.y
#> 1 5.843333 3.057333
# with quotes (works!)
tryfn(iris, "Sepal.Length", "Sepal.Width")
#> mean.x mean.y
#> 1 5.843333 3.057333
# pmap without quotes (doesn't work)
purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = list(Sepal.Length, wt, len),
    y = list(Sepal.Width, mpg, dose)
  ),
  .f = tryfn
)
#> Error in is.data.frame(.l): object 'Sepal.Length' not found
# pmap with quotes (works!)
purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = list("Sepal.Length", "wt", "len"),
    y = list("Sepal.Width", "mpg", "dose")
  ),
  .f = tryfn
)
#> [[1]]
#> mean.x mean.y
#> 1 5.843333 3.057333
#>
#> [[2]]
#> mean.x mean.y
#> 1 3.21725 20.09062
#>
#> [[3]]
#> mean.x mean.y
#> 1 18.81333 1.166667
Created on 2018-05-21 by the reprex package (v0.2.0).
The problem is that R sees Sepal.Length, wt, and len as symbols, so it looks them up in the current environment and tries to evaluate them while the list is being built. That fails because they are columns of a data frame, not objects in that environment. When you quote them, R has nothing to look up: they are plain strings, and they are passed along as values.
If you replace list with base::alist, dplyr::vars, or rlang::exprs, it should work, because those functions capture their arguments without evaluating them.
Note: since the inputs are already quoted this way, we no longer need rlang::enquo inside tryfn.
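To make the difference concrete, here is a minimal sketch of what each helper hands back. The variable names are only illustrative, and the check on dplyr::vars assumes it returns a list of quosures, as current versions of dplyr do.

```r
library(rlang)
library(dplyr)

# alist() and exprs() keep the bare names as unevaluated symbols
captured_alist <- alist(Sepal.Length, wt, len)
captured_exprs <- rlang::exprs(Sepal.Length, wt, len)
all_symbols_alist <- all(vapply(captured_alist, is.symbol, logical(1)))
all_symbols_exprs <- all(vapply(captured_exprs, is.symbol, logical(1)))

# vars() also avoids evaluation, but wraps each name in a quosure
captured_vars <- dplyr::vars(Sepal.Length, wt, len)
all_quosures_vars <- all(vapply(captured_vars, rlang::is_quosure, logical(1)))

# list() evaluates its arguments, so the same bare names raise an error
list_error <- tryCatch(
  list(Sepal.Length, wt, len),
  error = function(e) conditionMessage(e)
)
```

In all three non-failing cases the column names reach the function as code rather than as values, which is what lets `!!` splice them into dplyr::select.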
# loading the needed libraries
library(rlang)
library(tidyverse)
# defining the function
tryfn <- function(data, x, y) {
  data <-
    dplyr::select(
      .data = data,
      x = !!x,
      y = !!y
    )
  # creating a data frame of means
  result_df <- data.frame(mean.x = mean(data$x), mean.y = mean(data$y))
  # return the data frame
  return(result_df)
}
# alist handles its arguments as if they described function arguments.
# So the values are not evaluated, and tagged arguments with no value are
# allowed whereas list simply ignores them.
purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = alist(Sepal.Length, wt, len),
    y = alist(Sepal.Width, mpg, dose)
  ),
  .f = tryfn
)
#> [[1]]
#> mean.x mean.y
#> 1 5.843333 3.057333
#>
#> [[2]]
#> mean.x mean.y
#> 1 3.21725 20.09062
#>
#> [[3]]
#> mean.x mean.y
#> 1 18.81333 1.166667
purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = dplyr::vars(Sepal.Length, wt, len),
    y = dplyr::vars(Sepal.Width, mpg, dose)
  ),
  .f = tryfn
)
#> [[1]]
#> mean.x mean.y
#> 1 5.843333 3.057333
#>
#> [[2]]
#> mean.x mean.y
#> 1 3.21725 20.09062
#>
#> [[3]]
#> mean.x mean.y
#> 1 18.81333 1.166667
purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = rlang::exprs(Sepal.Length, wt, len),
    y = rlang::exprs(Sepal.Width, mpg, dose)
  ),
  .f = tryfn
)
#> [[1]]
#> mean.x mean.y
#> 1 5.843333 3.057333
#>
#> [[2]]
#> mean.x mean.y
#> 1 3.21725 20.09062
#>
#> [[3]]
#> mean.x mean.y
#> 1 18.81333 1.166667
Created on 2018-05-21 by the reprex package (v0.2.0).
The issue isn't really with purrr. The same behavior can be observed with:
list(Sepal.Length) # Error: object 'Sepal.Length' not found
As I understand it, all of the magic with !!, enquo, and the like is available when you're passing arguments into a function you have written, because enquo captures the argument's expression before it gets evaluated. That's why passing the unquoted field names to tryfn() directly works.
But with pmap(), you're putting the field names (Sepal.Width, wt, etc.) in a list definition, and list evaluates its arguments immediately. pmap never even gets a chance to pass things into tryfn, since your list barfs on definition.
Passing in your field names as strings works just fine, because list can hold strings, and pmap then has the chance to map them into tryfn().
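A small sketch of that distinction, using two hypothetical helpers (show_capture and eager are made up for illustration and are not part of rlang or purrr): the first captures its argument with enquo, the second evaluates it the way list does.

```r
library(rlang)

# hypothetical helper: captures the argument's expression instead of its value
show_capture <- function(x) {
  captured <- rlang::enquo(x)
  rlang::as_label(captured)
}

# hypothetical helper: an ordinary function that evaluates its argument
eager <- function(x) {
  x
}

# works: the bare name is never evaluated, only recorded
captured_name <- show_capture(Sepal.Length)

# fails the same way list(Sepal.Length) does, because the name is looked up
eager_error <- tryCatch(
  eager(Sepal.Length),
  error = function(e) conditionMessage(e)
)
```

Here captured_name holds the column name as text, while eager_error holds the "object not found" message, which mirrors what happens when the list for pmap is built.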
Hadley's review of quasiquotation with dplyr might be useful to you.
To answer your second question:
How can I make a function using rlang such that I won't have to quote the arguments even if used in purrr::pmap?
You can wrap your field names with quo() to avoid literally quoting them as strings, although I'm not sure that's much of an improvement:
purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = list(quo(Sepal.Length), quo(wt), quo(len)),
    y = list(quo(Sepal.Width), quo(mpg), quo(dose))
  ),
  .f = tryfn
) %>%
  bind_rows(., .id = "dataset")
#> dataset mean.x mean.y
#> 1 1 5.843333 3.057333
#> 2 2 3.217250 20.090625
#> 3 3 18.813333 1.166667
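If wrapping every name in quo() feels noisy, rlang::quos() captures several names in one call. The sketch below assumes a variant of tryfn that unquotes x and y directly with !! (no enquo inside), and it assumes pmap accepts the quosures list that quos() returns as one of its inputs; it has not been checked against every purrr version.

```r
library(rlang)
library(dplyr)
library(purrr)

# assumed variant of tryfn: x and y arrive already quoted, so they are
# unquoted directly instead of being captured with enquo()
tryfn <- function(data, x, y) {
  data <- dplyr::select(.data = data, x = !!x, y = !!y)
  data.frame(mean.x = mean(data$x), mean.y = mean(data$y))
}

# quos() quotes all of its arguments at once, one quosure per name
results <- purrr::pmap(
  .l = list(
    data = list(iris, mtcars, ToothGrowth),
    x = rlang::quos(Sepal.Length, wt, len),
    y = rlang::quos(Sepal.Width, mpg, dose)
  ),
  .f = tryfn
)

# stack the per-dataset results, keeping the list position as an id column
results_df <- dplyr::bind_rows(results, .id = "dataset")
```

The caller still has to quote somewhere, since bare names cannot survive being placed in an ordinary list, but the quoting happens once per argument rather than once per column.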