Flatten list column in data frame with ID column

My data frame contains the output of a survey with a select multiple question type. Some cells have multiple values.

df <- data.frame(a=1:3,b=I(list(1,1:2,1:3)))
df
  a       b
1 1       1
2 2    1, 2
3 3 1, 2, 3

I would like to flatten out the list to obtain the following output:

df
  a       b
1 1       1
2 2       1
3 2       2
4 3       1
5 3       2
6 3       3

should be easy but somehow I can't find the search terms. thanks.

标签： r dataframe reshape

4条回答

贪生不怕死

2楼-- · 2019-02-21 12:44

Here's one way, with data.table:

require(data.table)
data.table(df)[,as.integer(unlist(b)),by=a]

If b is stored consistently, as.integer can be skipped. You can check with

unique(sapply(df$b,class))
# [1] "numeric" "integer"

0人赞添加讨论(0) 举报

小情绪 Triste *

3楼-- · 2019-02-21 12:50

Here's another base solution, far less elegant than any other solution posted thus far. Posting for the sake of completeness, though personally I would recommend akrun's base solution.

with(df, cbind(a = rep(a, sapply(b, length)), b = do.call(c, b)))

This constructs the first column as the elements of a, where each is repeated to match the length of the corresponding list item from b. The second column is b "flattened" using do.call() with c().

As Ananda Mahto pointed out in a comment, sapply(b, length) can be replaced with lengths(b) in the most recent version of R (3.2, if I'm not mistaken).

0人赞添加讨论(0) 举报

beautiful°

4楼-- · 2019-02-21 12:55

You can just use unnest from "tidyr":

library(tidyr)
unnest(df, b)
#   a b
# 1 1 1
# 2 2 1
# 3 2 2
# 4 3 1
# 5 3 2
# 6 3 3

0人赞添加讨论(0) 举报

霸刀☆藐视天下

5楼-- · 2019-02-21 13:06

Using base R, one option is stack after naming the list elements of 'b' column with that of the elements of 'a'. We can use setNames to change the names.

stack(setNames(df$b, df$a))

Or another option would be to use unstack to automatically name the list element of 'b' with 'a' elements and then do the stack to get a data.frame output.

stack(unstack(df, b~a))

Or we can use a convenient function listCol_l from splitstackshape to convert the list to data.frame.

library(splitstackshape)
listCol_l(df, 'b')

0人赞添加讨论(0) 举报

Flatten list column in data frame with ID column

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间