I have a data frame that I want to convert to a three-dimensional array. One of the columns in the data frame should serve as the grouping variable for splitting the frame into 2D matrices that can then be combined into the array. In the following minimal working example, the data frame should be split into matrices by the variable "i", then combined into a 4x4x2 array. The solution should be practical for large data sets and, ideally, generalize to converting a data frame into an n-dimensional array.
# Make reproducible
set.seed(123)
df <- data.frame(i = rep(1:2, each = 4),
                 x = rep(rep(0:1, each = 2), 2),
                 y = rep(rep(0:1, 2), 2),
                 l = rnorm(8))
df
#   i x y           l
# 1 1 0 0 -0.56047565
# 2 1 0 1 -0.23017749
# 3 1 1 0  1.55870831
# 4 1 1 1  0.07050839
# 5 2 0 0  0.12928774
# 6 2 0 1  1.71506499
# 7 2 1 0  0.46091621
# 8 2 1 1 -1.26506123
Note: I suspect that Hadley Wickham's plyr may provide the needed tool, perhaps daply?
Here is what I'd probably do:
library(abind)
abind(split(df, df$i), along=3)
# , , 1
#
#   i x y           l
# 5 1 0 0 -0.56047565
# 6 1 0 1 -0.23017749
# 7 1 1 0  1.55870831
# 8 1 1 1  0.07050839
#
# , , 2
#
#   i x y          l
# 5 2 0 0  0.1292877
# 6 2 0 1  1.7150650
# 7 2 1 0  0.4609162
# 8 2 1 1 -1.2650612
It sounds like you are looking for split:
> split(df, df$i)
$`1`
  i x y           l
1 1 0 0 -0.56047565
2 1 0 1 -0.23017749
3 1 1 0  1.55870831
4 1 1 1  0.07050839

$`2`
  i x y          l
5 2 0 0  0.1292877
6 2 0 1  1.7150650
7 2 1 0  0.4609162
8 2 1 1 -1.2650612
This results in a list of two data.frames separated by your "i" column.
To get an array, you've got Josh's answer, or you can use simplify2array from base R:
> simplify2array(by(df, df$i, as.matrix))
, , 1

  i x y           l
1 1 0 0 -0.56047565
2 1 0 1 -0.23017749
3 1 1 0  1.55870831
4 1 1 1  0.07050839

, , 2

  i x y          l
1 2 0 0  0.1292877
2 2 0 1  1.7150650
3 2 1 0  0.4609162
4 2 1 1 -1.2650612
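Each slice above still carries the grouping column i, which is constant within a slice. A minimal sketch, assuming you only want x, y and l in each matrix, is to drop i before splitting. The names slices and arr are only for illustration:

```r
set.seed(123)
df <- data.frame(i = rep(1:2, each = 4),
                 x = rep(rep(0:1, each = 2), 2),
                 y = rep(rep(0:1, 2), 2),
                 l = rnorm(8))

# Split every column except the grouping column by i,
# turning each piece into a numeric matrix.
slices <- lapply(split(df[c("x", "y", "l")], df$i), as.matrix)

# Stack the equally sized matrices along a third dimension.
arr <- simplify2array(slices)

dim(arr)            # rows, columns, groups
dimnames(arr)[[3]]  # group labels, taken from the values of i
```

This only works when every group has the same number of rows; otherwise simplify2array returns the list unchanged instead of an array.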
Maybe I'm reading the question wrong, but the MWE describes a 2x2x2 array (x, y, i (a.k.a. z)). The current answers appear to return arrays whose slices are whole data.frames rather than 2D matrices of the values (per OP). array() will reshape the value column of a data.frame into an n-dimensional array. It fills in column-major order (first dimension varying fastest), so the rows must be sorted by i, then x, then y, and the fastest-varying column, y, becomes the first dimension:
dfa <- array(data = df$l,
             dim = c(length(unique(df$y)),
                     length(unique(df$x)),
                     length(unique(df$i))),
             # rows are y, columns are x, slices are i
             dimnames = list(unique(df$y), unique(df$x), unique(df$i)))
dfa
, , 1

           0          1
0 -0.5604756 1.55870831
1 -0.2301775 0.07050839

, , 2

          0          1
0 0.1292877  0.4609162
1 1.7150650 -1.2650612
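Because array() fills its result in column-major order, the call above depends on the rows being sorted and on every combination of the index columns being present. A minimal sketch of an order-independent variant that also generalizes to n dimensions is to place each value by matrix indexing. to_array is a hypothetical helper name, not a base R function, and it assumes a numeric value column:

```r
set.seed(123)
df <- data.frame(i = rep(1:2, each = 4),
                 x = rep(rep(0:1, each = 2), 2),
                 y = rep(rep(0:1, 2), 2),
                 l = rnorm(8))

# Hypothetical helper: `dims` names the index columns (one per array
# dimension), `value` names the numeric column that fills the cells.
to_array <- function(data, dims, value) {
  f   <- lapply(data[dims], factor)   # one factor per dimension
  lev <- lapply(f, levels)            # sorted labels per dimension
  out <- array(NA_real_, dim = unname(lengths(lev)), dimnames = lev)
  # One row per observation, one column per dimension: the position
  # of that observation inside the array.
  idx <- do.call(cbind, lapply(f, as.integer))
  out[idx] <- data[[value]]
  out
}

arr <- to_array(df, c("x", "y", "i"), "l")
arr["0", "1", "2"]   # the value of l where x = 0, y = 1, i = 2

# Row order no longer matters.
shuffled <- df[sample(nrow(df)), ]
identical(to_array(shuffled, c("x", "y", "i"), "l"), arr)
```

Combinations that do not occur in the data stay NA, and if a combination occurs more than once the last row wins. Passing more column names in dims gives an array with correspondingly more dimensions.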