R dplyr choose value from column with column name

2020-04-07 08:25发布

问题:

Hi I have a data frame with different data columns (say x, y, z ) and another column which specifies which one to choose. I want to use dplyr::mutate (or similar) to make a new column that has the value corresponding to value in the column specified by "choose". But I want to keep all the columns in the first place. In my real data I also have a few other columns with metadata.

Example data:

library(dplyr)
testdf <- data.frame(x = 1:5, y = 11:15, z = 101:105, choose = c("z","y","x","y","z"))

I can make this work in my example using case_when but in my actual script the column names and choose column are generated and may have different values, so I don't want to hardcode what names there could be.

Desired output/test

mutate(testdf, selectedValue = case_when(choose == "x" ~x,
                                     choose == "y"~ y,
                                     choose == "z"~ z, T~NA_integer_))

#>   x  y   z choose selectedValue
#> 1 1 11 101      z           101
#> 2 2 12 102      y            12
#> 3 3 13 103      x             3
#> 4 4 14 104      y            14
#> 5 5 15 105      z           105

Created on 2019-09-18 by the reprex package (v0.3.0)

回答1:

Here is a data.table solution. This is a case where I don't think the dplyr solutions are more human readable than other solutions (base r and data.table specifically).

library(data.table)
testdt <- data.table(x = 1:5, y = 11:15, z = 101:105, choose = c("z","y","x","y","z"))
testdt[,selectedValue := get(choose), by = choose]
testdt
#>    x  y   z choose selectedValue
#> 1: 1 11 101      z           101
#> 2: 2 12 102      y            12
#> 3: 3 13 103      x             3
#> 4: 4 14 104      y            14
#> 5: 5 15 105      z           105

Created on 2019-09-17 by the reprex package (v0.3.0)



回答2:

If you use purrr with dplyr:

testdf %>%
    mutate(selectedValue = purrr::map2_dbl(row_number(),choose,~testdf[.x,.y]))



回答3:

A base R way would be to create a row/column index matrix to subset from testdf. We use match to get column index and create a sequential counter for row.

testdf$selectedvalue <- testdf[cbind(seq_len(nrow(testdf)), 
                               match(testdf$choose, names(testdf)))]
testdf
#  x  y   z choose selectedvalue
#1 1 11 101      z           101
#2 2 12 102      y            12
#3 3 13 103      x             3
#4 4 14 104      y            14
#5 5 15 105      z           105


标签: r dplyr