How to convert a data frame to a 3d array in R

Posted 2020-02-09 08:53

Question:

I have a data frame that I want to convert to a three-dimensional array. One of the columns in the data frame should serve as the grouping variable for splitting the frame into 2D matrices, which are then combined into the array. In the following minimal working example, the data frame should be split into matrices by the variable "i", then combined into a 4x4x2 array. The solution should be practical for large data sets and, ideally, generalize to converting a data frame into an n-dimensional array.

# Make reproducible 
set.seed(123)

df <- {
  data.frame(i=rep(1:2, each=4),
             x=rep(rep(0:1, each=2), 2),
             y=rep(rep(0:1, 2), 2),
             l=rnorm(8))
}

df
#   i x y           l
# 1 1 0 0 -0.56047565
# 2 1 0 1 -0.23017749
# 3 1 1 0  1.55870831
# 4 1 1 1  0.07050839
# 5 2 0 0  0.12928774
# 6 2 0 1  1.71506499
# 7 2 1 0  0.46091621
# 8 2 1 1 -1.26506123

Note: I suspect that Hadley Wickham's plyr may provide the needed tool, perhaps daply?

Answer 1:

Here is what I'd probably do:

library(abind)
abind(split(df, df$i), along=3)
# , , 1
# 
#   i x y           l
# 5 1 0 0 -0.56047565
# 6 1 0 1 -0.23017749
# 7 1 1 0  1.55870831
# 8 1 1 1  0.07050839
# 
# , , 2
# 
#   i x y          l
# 5 2 0 0  0.1292877
# 6 2 0 1  1.7150650
# 7 2 1 0  0.4609162
# 8 2 1 1 -1.2650612


Answer 2:

It sounds like you are looking for split:

> split(df, df$i)
$`1`
  i x y           l
1 1 0 0 -0.56047565
2 1 0 1 -0.23017749
3 1 1 0  1.55870831
4 1 1 1  0.07050839

$`2`
  i x y          l
5 2 0 0  0.1292877
6 2 0 1  1.7150650
7 2 1 0  0.4609162
8 2 1 1 -1.2650612

This returns a list of two data frames, one for each value of your "i" column.


To get an array, you can use abind as in Josh's answer, or you can use simplify2array from base R:

> simplify2array(by(df, df$i, as.matrix))
, , 1

  i x y           l
1 1 0 0 -0.56047565
2 1 0 1 -0.23017749
3 1 1 0  1.55870831
4 1 1 1  0.07050839

, , 2

  i x y          l
1 2 0 0  0.1292877
2 2 0 1  1.7150650
3 2 1 0  0.4609162
4 2 1 1 -1.2650612
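
If the grouping column should not be repeated inside every slice, one option is to drop it before splitting. The following is only a sketch of that variation, assuming the df from the question; the names pieces and arr are made up for illustration.

```r
set.seed(123)
df <- data.frame(i = rep(1:2, each = 4),
                 x = rep(rep(0:1, each = 2), 2),
                 y = rep(rep(0:1, 2), 2),
                 l = rnorm(8))

# Split only the non-grouping columns, so each piece is a 4 x 3 data frame.
pieces <- split(df[c("x", "y", "l")], df$i)

# Convert each piece to a numeric matrix, then stack the matrices
# along a third dimension named after the values of i.
arr <- simplify2array(lapply(pieces, as.matrix))

dim(arr)         # 4 3 2
arr[, "l", "2"]  # the l column of the slice for i == 2
```

Because every piece is converted with as.matrix first, the result is a plain numeric array, and the third dimension can be indexed by the value of i as a character string.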


Answer 3:

Maybe I'm reading the question wrong, but the MWE describes a 2x2x2 array with dimensions x, y, and i (a.k.a. z). The existing answers stack whole data frames, index columns included, rather than 2D matrices of the values (per OP). array() can build an n-dimensional array directly from the value column, provided the data frame is complete and sorted by its index columns:

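# array() fills in column-major order: the first dimension varies fastest.
# In df, y varies fastest, then x, then i, so the dimensions are listed
# in that order (rows are y, columns are x, slices are i).
dfa <- array(data = df$l, 
             dim=c(length(unique(df$y)), 
                   length(unique(df$x)), 
                   length(unique(df$i))), 
             dimnames=list(unique(df$y), unique(df$x), unique(df$i))
            )
> dfa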
, , 1

           0          1
0 -0.5604756 1.55870831
1 -0.2301775 0.07050839

, , 2

          0          1
0 0.1292877  0.4609162
1 1.7150650 -1.2650612
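
The array() call above relies on the rows being complete and already sorted. As a rough sketch of a version that does not depend on row order and extends to any number of index columns, each value can be placed by matrix indexing. The helper df_to_array and its arguments dims and value are hypothetical names introduced here, not part of base R.

```r
set.seed(123)
df <- data.frame(i = rep(1:2, each = 4),
                 x = rep(rep(0:1, each = 2), 2),
                 y = rep(rep(0:1, 2), 2),
                 l = rnorm(8))

# Hypothetical helper: `dims` names the index columns (one per array
# dimension) and `value` names the column holding the cell values.
df_to_array <- function(df, dims, value) {
  f   <- lapply(df[dims], factor)   # one factor per dimension
  lev <- lapply(f, levels)          # sorted labels for each dimension
  out <- array(NA_real_, dim = unname(lengths(lev)), dimnames = lev)
  # One row of integer positions per data frame row; indexing an array
  # with such a matrix addresses one cell per row.
  idx <- do.call(cbind, lapply(f, as.integer))
  out[idx] <- df[[value]]           # a later duplicate overwrites an earlier one
  out
}

arr <- df_to_array(df, dims = c("x", "y", "i"), value = "l")
dim(arr)            # 2 2 2
arr["0", "1", "2"]  # the value of l where x == 0, y == 1, i == 2
```

Combinations of x, y, and i that never occur in the data frame stay NA, and the first index here is x, so each slice is the transpose of the corresponding slice printed above.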


Tags: arrays r