R flatten nested data.table

2019-01-20 14:01发布

问题:

I have nested data table, what is the best way to flatten nested tables to one data.table? Note, that nested tables can have unequal length, so tidyr::unnest() does not work.

Code example:

NestedTable <- data.table(
  COLUMN1 = c('var1', 'var2','var3'),
  COLUMN2 = c('col2a', 'col2b', 'col2c')
)

# add nested data.tables
NestedTable[ , NESTED_COL := list(list(data.table(
  COLUMN4 = c(
    'A', 'B'
  ), 
  COLUMN5 = c(
    'C', 'D'
  )
)))]

NestedTable[ , NESTED_COL2 := list(list(data.table(
  COLUMN6 = c(
    'A', 'B','C'
  ), 
  COLUMN7 = c(
    'C', 'D','E'
  )
)))]

#      COLUMN1 COLUMN2 COLUMN4 COLUMN5 COLUMN6 COLUMN7
# 1:    var1   col2a   A       C       A       C
# 2:    var1   col2a   B       D       B       D
# 3:    var1   col2a                   C       E
# 4:    var2   col2b   A       C       A       C
# 5:    var2   col2b   B       D       B       D
# 6:    var2   col2b                   C       E
# 7:    var3   col2c   A       C       A       C
# 8:    var3   col2c   B       D       B       D
# 9:    var3   col2c                   C       E

回答1:

We can try with unlist

NestedTable[, unlist(NESTED_COL, recursive=FALSE), .(COLUMN1, COLUMN2)]


回答2:

You can use tidyr :

NestedTable %>%
  tidyr::unnest(NESTED_COL)
#>    COLUMN1 COLUMN2 COLUMN4 COLUMN5
#> 1:    var1   col2a       A       C
#> 2:    var1   col2a       B       D
#> 3:    var2   col2b       A       C
#> 4:    var2   col2b       B       D
#> 5:    var3   col2c       A       C
#> 6:    var3   col2c       B       D