In the following example, why should we favour using f1 over f2? Is it more efficient in some sense? For someone used to base R, it seems more natural to use the "substitute + eval" option.
library(dplyr)

d = data.frame(x = 1:5,
               y = rnorm(5))

# using enquo + !!
f1 = function(mydata, myvar) {
  m = enquo(myvar)
  mydata %>%
    mutate(two_y = 2 * !!m)
}

# using substitute + eval
f2 = function(mydata, myvar) {
  m = substitute(myvar)
  mydata %>%
    mutate(two_y = 2 * eval(m))
}

all.equal(d %>% f1(y), d %>% f2(y)) # TRUE
In other words, and beyond this particular example, my question is: can I get away with programming using dplyr NSE functions with good ol' base R like substitute + eval, or do I really need to learn to love all those rlang functions because there is a benefit to it (speed, clarity, compositionality, ...)?
I want to give an answer that is independent of dplyr, because there is a very clear advantage to using enquo over substitute. Both look in the calling environment of a function to identify the expression that was given to that function. The difference is that substitute() does it only once, while !!enquo() will correctly walk up the entire calling stack.

Consider a simple function that uses substitute():

This functionality breaks when the call is nested inside another function:
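The code for these two cases is not reproduced here, so the following is only a minimal sketch of what they could look like. The function names f and g and the values a = 2, b = 3 are assumptions chosen for illustration.

```r
# f() captures its argument with substitute() and evaluates it
# against a small, made-up set of values.
f <- function(myExpr) {
  eval(substitute(myExpr), list(a = 2, b = 3))
}

f(a + b)  # 5
f(a * b)  # 6

# g() tries to forward its argument to f(). Inside f(), substitute()
# now sees the forwarding expression `substitute(myExpr)` rather than
# the `a + b` the user typed, so the original expression is lost.
g <- function(myExpr) {
  f(substitute(myExpr))
}

res <- g(a + b)
identical(res, 5)  # FALSE
```

The nested call does not fail with an error; it just does not return 5, which is arguably worse.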
Now consider the same functions re-written using enquo():
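A sketch of such a rewrite with rlang, again under assumptions: f_tidy is a hypothetical name, g2 is the name this answer uses for the wrapper, and a = 2, b = 3 are made-up values.

```r
library(rlang)

# enquo() captures the expression together with its environment (a quosure);
# eval_tidy() evaluates it with the supplied values taking precedence.
f_tidy <- function(myExpr) {
  eval_tidy(enquo(myExpr), list(a = 2, b = 3))
}

# The wrapper captures its own argument and unquotes it with !!,
# so f_tidy() receives the user's original expression.
g2 <- function(myExpr) {
  f_tidy(!!enquo(myExpr))
}

f_tidy(a + b)  # 5
g2(a + b)      # 5
```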
And that is why enquo() + !! is preferable to substitute() + eval(). dplyr simply takes full advantage of this property to build a coherent set of NSE functions.

UPDATE: rlang 0.4.0 introduced a new operator {{ (pronounced "curly curly"), which is effectively a shorthand for !!enquo(). This allows us to simplify the definition of g2.
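As a sketch of that simplification (f_tidy and the values a = 2, b = 3 are assumptions for illustration; only the name g2 comes from the text):

```r
library(rlang)  # {{ needs rlang 0.4.0 or later

f_tidy <- function(myExpr) {
  eval_tidy(enquo(myExpr), list(a = 2, b = 3))
}

# {{ myExpr }} stands in for !!enquo(myExpr)
g2 <- function(myExpr) {
  f_tidy({{ myExpr }})
}

g2(a + b)  # 5
```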
enquo() and !! also allow you to program with other dplyr verbs such as group_by and select. I'm not sure if substitute and eval can do that. Take a look at this example where I modify your data frame a little bit:

Edit: also, enquos and !!! make it easier to capture a list of variables:

Credit: Programming with dplyr
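The examples for this answer are not reproduced above. A rough sketch of both patterns could look like the following, where the data frame d2 (the question's data with an added grouping column z) and the function names mean_by and count_by are assumptions:

```r
library(dplyr)

# Assumed variant of the question's data frame with a grouping column z.
d2 <- data.frame(x = 1:6,
                 y = c(1, 2, 3, 4, 5, 6),
                 z = c("a", "a", "b", "b", "c", "c"))

# Captured variables reused in group_by(), summarise() and select().
mean_by <- function(mydata, group_var, value_var) {
  g <- enquo(group_var)
  v <- enquo(value_var)
  mydata %>%
    group_by(!!g) %>%
    summarise(avg = mean(!!v)) %>%
    select(!!g, avg)
}

res1 <- mean_by(d2, z, y)

# Any number of grouping variables, captured with enquos()
# and spliced into group_by() with !!!.
count_by <- function(mydata, ...) {
  groups <- enquos(...)
  mydata %>%
    group_by(!!!groups) %>%
    summarise(n = n())
}

res2 <- count_by(d2, z)
```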
To add some nuance, these things are not necessarily that complex in base R.

It is important to remember to use eval.parent() when relevant, to evaluate substituted arguments in the right environment. If you use eval.parent() properly, the expressions in nested calls will find their way. If you don't, you might discover environment hell :).

The base toolbox that I use is made of quote(), substitute(), bquote(), as.call(), and do.call() (the latter useful when used with substitute()).
Without going into details, here is how to solve in base R the cases presented by @Artem and @Tung, without any tidy evaluation, and then the last example, not using quo/enquo, but still benefiting from splicing and unquoting (!!! and !!).

We'll see that splicing and unquoting make code nicer (but require functions to support them!), and that in the present cases using quosures doesn't improve things dramatically (but still arguably does).
solving Artem's case with base R
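A minimal sketch of this, assuming a substitute()-based function like the one described in Artem's answer (the names f0 and g0 and the values a = 2, b = 3 are made up):

```r
f0 <- function(myExpr) {
  eval(substitute(myExpr), list(a = 2, b = 3))
}

# Rebuild the call f0(<what the user typed>) with substitute(),
# then evaluate that call in the caller's frame with eval.parent().
g0 <- function(myExpr) {
  eval.parent(substitute(f0(myExpr)))
}

f0(a + b)  # 5
g0(a + b)  # 5
```

The wrapper never evaluates the expression itself; it only forwards a reconstructed call, which is why the nesting no longer breaks.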
solving Tung's 1st case with base R
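Tung's example is not reproduced here, so this sketch assumes it groups by one captured column and summarises another. The data frame d2 and the name mean_by_base are hypothetical.

```r
library(dplyr)

d2 <- data.frame(x = 1:6,
                 y = c(1, 2, 3, 4, 5, 6),
                 z = c("a", "a", "b", "b", "c", "c"))

# substitute() replaces mydata, group_var and value_var with what the
# caller typed; eval.parent() then runs the resulting dplyr pipeline
# in the caller's frame.
mean_by_base <- function(mydata, group_var, value_var) {
  eval.parent(substitute(
    mydata %>%
      group_by(group_var) %>%
      summarise(avg = mean(value_var)) %>%
      select(group_var, avg)
  ))
}

res <- mean_by_base(d2, z, y)
```

Note that this relies on the data being passed by name (as in mean_by_base(d2, z, y)), since the whole call is rebuilt and re-evaluated in the caller's frame.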
solving Tung's 2nd case with base R
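As a sketch, assuming the case is about grouping by a list of variables (the data frame d2 and the variable names are made up), the call can be assembled by hand with as.call():

```r
library(dplyr)

d2 <- data.frame(x = c(1, 1, 2, 2, 3, 3),
                 z = c("a", "a", "b", "b", "c", "c"),
                 y = 1:6)

# A list of unevaluated column names.
by_vars <- alist(x, z)

# Build the call group_by(d2, x, z) and evaluate it.
grouped_call <- as.call(c(list(quote(group_by), quote(d2)), by_vars))
grouped <- eval(grouped_call)

res <- summarise(grouped, total = sum(y))
```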
in a function:
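One possible version, relying on the fact that substitute() also expands `...` into the expressions the caller supplied. The data frame d2 and the name sum_by_base are assumptions.

```r
library(dplyr)

d2 <- data.frame(x = c(1, 1, 2, 2, 3, 3),
                 z = c("a", "a", "b", "b", "c", "c"),
                 y = 1:6)

# substitute() fills in mydata, value_var and the `...` arguments,
# giving e.g. summarise(group_by(d2, x, z), total = sum(y)),
# which eval.parent() evaluates in the caller's frame.
sum_by_base <- function(mydata, value_var, ...) {
  eval.parent(substitute(
    summarise(group_by(mydata, ...), total = sum(value_var))
  ))
}

res <- sum_by_base(d2, y, x, z)
```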
solving Tung's 2nd case with base R but using !! and !!!
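A sketch under the same assumptions as before (made-up data frame d2, grouping by a list of variables): the expressions are captured with base tools only, and dplyr's support for !! and !!! does the rest.

```r
library(dplyr)

d2 <- data.frame(x = c(1, 1, 2, 2, 3, 3),
                 z = c("a", "a", "b", "b", "c", "c"),
                 y = 1:6)

by_vars <- alist(x, z)  # plain symbols, no quosures
value <- quote(y)

res <- d2 %>%
  group_by(!!!by_vars) %>%          # splice the list of symbols
  summarise(total = sum(!!value))   # unquote a single symbol
```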
in a function:
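A possible function version, with the hypothetical name sum_by_bang and a made-up data frame d2. The arguments are captured with substitute() and then spliced or unquoted inside the dplyr verbs.

```r
library(dplyr)

d2 <- data.frame(x = c(1, 1, 2, 2, 3, 3),
                 z = c("a", "a", "b", "b", "c", "c"),
                 y = 1:6)

sum_by_bang <- function(mydata, value_var, ...) {
  value <- substitute(value_var)
  # substitute(list(...)) gives list(x, z); drop the leading `list`.
  by_vars <- as.list(substitute(list(...)))[-1]
  mydata %>%
    group_by(!!!by_vars) %>%
    summarise(total = sum(!!value))
}

res <- sum_by_bang(d2, y, x, z)
```

Expressions captured with substitute() do not remember the environment they came from, which is the part quosures add. For plain column names, as here, that makes no difference.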
Imagine there is a different x you want to multiply:
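A sketch of the situation, reusing f1 from the question and assuming an outer x equal to 3 and a data frame with a column that is also called x. It shows the call with !! and, for comparison, the call without it.

```r
library(dplyr)

d <- data.frame(x = 1:5)
x <- 3  # a different x, living outside the data frame (assumed value)

f1 <- function(mydata, myvar) {
  m <- enquo(myvar)
  mydata %>%
    mutate(two_y = 2 * !!m)
}

with_bang <- f1(d, !!x)  # uses the outer x: two_y is 6 in every row
without   <- f1(d, x)    # uses the column x: two_y is 2, 4, 6, 8, 10
```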
vs without the !!, where x refers to the column of the data frame.

!! gives you more control over scoping than substitute; with substitute you can only get the 2nd way easily.