dplyr - mutate: use dynamic variable names

2018-12-31 08:10发布

I want to use dplyr's mutate() to create multiple new columns in a data frame. The column names and their contents should be dynamically generated.

Example data from iris:

require(dplyr)
data(iris)
iris <- tbl_df(iris)

I've created a function to mutate my new columns from the Petal.Width variable:

multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    df <- mutate(df, varname = Petal.Width * n)  ## problem arises here
    df
}

Now I create a loop to build my columns:

for(i in 2:5) {
    iris <- multipetal(df=iris, n=i)
}

However, since mutate thinks varname is a literal variable name, the loop only creates one new variable (called varname) instead of four (called petal.2 - petal.5).

How can I get mutate() to use my dynamic name as variable name?

标签: r dplyr r-faq
7条回答
美炸的是我
2楼-- · 2018-12-31 08:54

After a lot of trial and error, I found the pattern UQ(rlang::sym("some string here"))) really useful for working with strings and dplyr verbs. It seems to work in a lot of surprising situations.

Here's an example with mutate. We want to create a function that adds together two columns, where you pass the function both column names as strings. We can use this pattern, together with the assignment operator :=, to do this.

## Take column `name1`, add it to column `name2`, and call the result `new_name`
mutate_values <- function(new_name, name1, name2){
  mtcars %>% 
    mutate(UQ(rlang::sym(new_name)) :=  UQ(rlang::sym(name1)) +  UQ(rlang::sym(name2)))
}
mutate_values('test', 'mpg', 'cyl')

The pattern works with other dplyr functions as well. Here's filter:

## filter a column by a value 
filter_values <- function(name, value){
  mtcars %>% 
    filter(UQ(rlang::sym(name)) != value)
}
filter_values('gear', 4)

Or arrange:

## transform a variable and then sort by it 
arrange_values <- function(name, transform){
  mtcars %>% 
    arrange(UQ(rlang::sym(name)) %>%  UQ(rlang::sym(transform)))
}
arrange_values('mpg', 'sin')

For select, you don't need to use the pattern. Instead you can use !!:

## select a column 
select_name <- function(name){
  mtcars %>% 
    select(!!name)
}
select_name('mpg')
查看更多
登录 后发表回答