Is it possible to use dplyr's mutate function without hard-coding the variable names? For example, the following code works, because I hard-code the name Var1:
> d=expand.grid(1:3, 20:22)
> d
Var1 Var2
1 1 20
2 2 20
3 3 20
4 1 21
5 2 21
6 3 21
7 1 22
8 2 22
9 3 22
> d=mutate(d, x=percent_rank(Var1))
> d
Var1 Var2 x
1 1 20 0.000
2 2 20 0.375
3 3 20 0.750
4 1 21 0.000
5 2 21 0.375
6 3 21 0.750
7 1 22 0.000
8 2 22 0.375
9 3 22 0.750
However, when I make the variable's name a variable, it no longer works:
> my.variable='Var1'
> d=mutate(d, x=percent_rank(my.variable))
> d
Var1 Var2 x
1 1 20 NaN
2 2 20 NaN
3 3 20 NaN
4 1 21 NaN
5 2 21 NaN
6 3 21 NaN
7 1 22 NaN
8 2 22 NaN
9 3 22 NaN
The eval() and as.symbol() functions don't seem to help, either.
In the devel version of dplyr (awaiting the new release, 0.6.0), the introduction of quosures and the unquote functions (!!, UQ) to evaluate quoted expressions inside group_by/summarise/mutate makes this much easier. It also has other features for passing column names.
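As a minimal sketch of the unquoting approach described above, assuming a dplyr version that supports tidy evaluation and that rlang is installed alongside it: a column name held in a string can be converted to a symbol with rlang::sym() and unquoted with !!. The .data pronoun is shown as an alternative. The names d1 and d2 are only illustrative.

```r
library(dplyr)

d <- expand.grid(1:3, 20:22)
my.variable <- "Var1"

# Turn the string into a symbol, then unquote it inside mutate()
d1 <- mutate(d, x = percent_rank(!!rlang::sym(my.variable)))

# Alternative: index the .data pronoun with the string
d2 <- mutate(d, x = percent_rank(.data[[my.variable]]))
```

Both calls should give the same x column as the hard-coded percent_rank(Var1) in the question.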
We can also create a function and pass the column as an argument.
In such a function, enquo does much the same job as substitute from base R: it takes the user's input argument and converts it to a quosure. Because we need the column name as a string, we can use quo_name to convert the quosure to a string, and the evaluation inside the mutate call is done by unquoting (!! or UQ).
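The pattern just described might look roughly like the sketch below. The helper name add_percent_rank and the "_rank" suffix are hypothetical, chosen only for illustration, and the code assumes a dplyr version with tidy evaluation plus rlang.

```r
library(dplyr)

# Hypothetical helper: adds a percent rank for a column passed unquoted
add_percent_rank <- function(data, col) {
  col <- rlang::enquo(col)                           # capture the argument as a quosure
  new_name <- paste0(rlang::quo_name(col), "_rank")  # column name as a string
  mutate(data, !!new_name := percent_rank(!!col))    # unquote both the name and the column
}

d <- expand.grid(1:3, 20:22)
res <- add_percent_rank(d, Var1)
```

Here the caller writes Var1 without quotes, and the new column is named from the captured argument.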
The great Hadley Wickham himself (hallowed be his name!) suggested this on the mutatr Google Groups:

You can use get and specify the environment in which the object "Var1" lives.

I suggest you read more about "non-standard evaluation" in the "Advanced R programming" wiki by Hadley Wickham: http://adv-r.had.co.nz/Computing-on-the-language.html
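One way the get approach could be written, as a sketch rather than a definitive form: the data frame itself is converted to an environment so that get looks the name up among its columns.

```r
library(dplyr)

d <- expand.grid(1:3, 20:22)
my.variable <- "Var1"

# as.environment(d) exposes the columns of d as variables,
# so get() retrieves the column whose name is stored in my.variable
d <- mutate(d, x = percent_rank(get(my.variable, envir = as.environment(d))))
```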
EDIT
This answer was recently upvoted, so I realized that the solution I gave a year and a half ago was not really great, and I am taking this opportunity to update my answer.
Since dplyr 0.3 you can use the standard evaluation versions of dplyr's functions, which are their "fun_" variants (for example mutate_).
You also have to use interp from the lazyeval package if you are doing computations on the variables.
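A sketch of how these two pieces might combine, assuming the lazyeval package is installed and a dplyr version that still ships the underscore variants (they were deprecated in later releases and may warn or be unavailable there). The placeholder v and the name rank_expr are illustrative only.

```r
library(dplyr)
library(lazyeval)

d <- expand.grid(1:3, 20:22)
my.variable <- "Var1"

# interp() replaces the placeholder v with the symbol Var1,
# giving the formula ~percent_rank(Var1)
rank_expr <- interp(~percent_rank(v), v = as.name(my.variable))

# mutate_() is the standard-evaluation variant of mutate()
# (assumption: still available in the installed dplyr)
d <- mutate_(d, x = rank_expr)
```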