dplyr mutate using variable columns

Posted 2019-06-23 18:32

Question:

I am trying to use mutate to create a new column with values based on a specific column.

Example final data frame (I am trying to create new_col):

x = tibble(colA = c(11, 12, 13),
           colB = c(91, 92, 93),
           col_to_use = c("colA", "colA", "colB"),
           new_col = c(11, 12, 93))

I would like to do something like:

x %>% mutate(new_col = col_to_use)

Except that instead of copying the strings in col_to_use, I would like each string to be treated as the name of the column to read the value from. I started with:

col_name = "colA"
x %>% mutate(new_col = !!as.name(col_name))

That works when the column name is held in a single static variable. However, I have been unable to make the column name vary from row to row. How do I pick the source column for each row based on the contents of a different column?

This question is basically the opposite of this: dplyr - mutate: use dynamic variable names. I wasn't able to adapt the solution to my problem.

Answer 1:

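We can use imap_dbl and pluck from the purrr package to achieve this task. Because col_to_use has no names, imap_dbl passes each element as .x and its position as .y, so pluck(x, .x, .y) returns the value in row .y of the column named .x.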

library(tidyverse)

x <- tibble(colA = c(11, 12, 13),
           colB = c(91, 92, 93),
           col_to_use = c("colA", "colA", "colB"))

x2 <- x %>%
  mutate(new_col = imap_dbl(col_to_use, ~pluck(x, .x, .y)))

x2
# # A tibble: 3 x 4
#   colA  colB col_to_use new_col
#  <dbl> <dbl> <chr>        <dbl>
# 1   11.   91. colA           11.
# 2   12.   92. colA           12.
# 3   13.   93. colB           93.
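
To make the row-by-row lookup more explicit, here is a minimal base R sketch of the same idea. It assumes a plain data.frame stand-in for the tibble so that no packages are needed; the names df and lookup_one are illustrative and not part of the answer.

```r
# Plain data.frame stand-in for the tibble above (assumption: same columns).
df <- data.frame(colA = c(11, 12, 13),
                 colB = c(91, 92, 93),
                 col_to_use = c("colA", "colA", "colB"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: the value in row i of the column named by col_to_use[i].
lookup_one <- function(i) df[[df$col_to_use[i]]][[i]]

# vapply() insists on exactly one double per row, much as imap_dbl does.
df$new_col <- vapply(seq_len(nrow(df)), lookup_one, numeric(1))
df$new_col
```

Like the purrr version, this visits one row at a time, so it is probably best suited to small or medium-sized data.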


Answer 2:

I'm not sure how to do it with tidyverse idioms alone (though I assume there's a way). But here's a method using apply:

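# apply() coerces each row to character because col_to_use is a string column,
# so convert the looked-up value back to numeric.
x$new_col = apply(x, 1, function(d) {
  as.numeric(d[match(d["col_to_use"], names(x))])
})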
  colA colB col_to_use new_col
1   11   91       colA      11
2   12   92       colA      12
3   13   93       colB      93

Or, putting the apply inside mutate:

x = x %>%
  mutate(new_col = apply(x, 1, function(d) {
    # as.numeric() undoes apply()'s row-wise coercion to character
    as.numeric(d[match(d["col_to_use"], names(x))])
  }))
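
A possible vectorized alternative, not taken from either answer, is base R matrix indexing: build a numeric matrix from the candidate columns only, then index it with (row, column) pairs. This is a sketch under the assumption that every column named in col_to_use is numeric; df, value_cols, vals and idx are illustrative names.

```r
# Plain data.frame stand-in for the tibble above (assumption: same columns).
df <- data.frame(colA = c(11, 12, 13),
                 colB = c(91, 92, 93),
                 col_to_use = c("colA", "colA", "colB"),
                 stringsAsFactors = FALSE)

# Columns that col_to_use is allowed to name (assumed to be all numeric).
value_cols <- c("colA", "colB")

# Only numeric columns go into the matrix, so values are not turned into strings.
vals <- as.matrix(df[value_cols])

# Two-column index matrix: row number, then position of the chosen column.
idx <- cbind(seq_len(nrow(df)), match(df$col_to_use, value_cols))

# Indexing a matrix by a two-column matrix picks one cell per index row.
df$new_col <- vals[idx]
df$new_col
```

Since the string column never enters the matrix, the result stays numeric without a conversion step, and the lookup happens in a single indexing call rather than once per row.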