stacking columns in data.frame into one column in

2019-02-09 20:10发布

问题:

I'm having trouble stacking columns in a data.frame into one column. Now my data looks something like this:

id   time    black   white   red 
a     1       b1      w1     r1
a     2       b2      w2     r2
a     3       b3      w3     r3
b     1       b4      w4     r4
b     2       b5      w5     r5
b     3       b6      w6     r6

I'm trying to transform the data.frame so that it looks like this:

id   time  colour 
a     1     b1
a     2     b2
a     3     b3
b     1     b4
b     2     b5
b     3     b6
a     1     w1
a     2     w2
a     3     w3
b     1     w4
b     2     w5
b     3     w6
a     1     r1
a     2     r2
.     .     .
.     .     .
.     .     .

I'm guessing that this problem requires using the reshape package, but I'm not exactly sure how to use it to stack multiple columns under one column. Can anyone provide help on this?

回答1:

Here's melt from reshape:

library(reshape)
melt(x, id.vars=c('id', 'time'),var='color')

And using reshape2 (an up-to-date, faster version of reshape) the syntax is almost identical.

The help files have useful examples (see ?melt and the link to melt.data.frame).

In your case, something like the following will work (assuming your data.frame is called DF)

library(reshape2)
melt(DF, id.var = c('id','time'), variable.name = 'colour')


回答2:

Since you mention "stacking" in your title, you can also look at the stack function in base R:

cbind(mydf[1:2], stack(mydf[3:5]))
#    id time values   ind
# 1   a    1     b1 black
# 2   a    2     b2 black
# 3   a    3     b3 black
# 4   b    1     b4 black
# 5   b    2     b5 black
# 6   b    3     b6 black
# 7   a    1     w1 white
# 8   a    2     w2 white
# 9   a    3     w3 white
# 10  b    1     w4 white
# 11  b    2     w5 white
# 12  b    3     w6 white
# 13  a    1     r1   red
# 14  a    2     r2   red
# 15  a    3     r3   red
# 16  b    1     r4   red
# 17  b    2     r5   red
# 18  b    3     r6   red

If the values in the "black", "white", and "red" columns are factors, you'll need to convert them to character values first.

cbind(mydf[1:2], stack(lapply(mydf[3:5], as.character)))