I want to reshape a wide format dataset that has multiple tests which are measured at 3 time points:
ID Test Year Fall Spring Winter
1 1 2008 15 16 19
1 1 2009 12 13 27
1 2 2008 22 22 24
1 2 2009 10 14 20
2 1 2008 12 13 25
2 1 2009 16 14 21
2 2 2008 13 11 29
2 2 2009 23 20 26
3 1 2008 11 12 22
3 1 2009 13 11 27
3 2 2008 17 12 23
3 2 2009 14 9 31
into a data set that separates the tests by column but converts the measurement time into long format, for each of the new columns like this:
ID Year Time Test1 Test2
1 2008 Fall 15 22
1 2008 Spring 16 22
1 2008 Winter 19 24
1 2009 Fall 12 10
1 2009 Spring 13 14
1 2009 Winter 27 20
2 2008 Fall 12 13
2 2008 Spring 13 11
2 2008 Winter 25 29
2 2009 Fall 16 23
2 2009 Spring 14 20
2 2009 Winter 21 26
3 2008 Fall 11 17
3 2008 Spring 12 12
3 2008 Winter 22 23
3 2009 Fall 13 14
3 2009 Spring 11 9
3 2009 Winter 27 31
I have unsuccessfully tried to use reshape and melt. Existing posts address transforming to single column outcome.
Sticking with base R, this is another good candidate for the "
stack
+reshape
" routine. Assuming our dataset is called "mydf":Base
reshape
function alternative method is below. Though this required usingreshape
twice, there might be a simpler way.Assuming your dataset is called
df1
Which gives:
Using
reshape2
:Update: Using data.table melt/cast from versions >= 1.9.0:
data.table
from versions 1.9.0 importsreshape2
package and implements fastmelt
anddcast
methods in C for data.tables. A comparison of speed on bigger data is shown below.For more info regarding NEWS, go here.
At the moment, you'll have to write
dcast.data.table
explicitly as it's not a S3 generic inreshape2
yet.Benchmarking on bigger data:
reshape2 timings:
data.table timings:
melt.data.table
is ~175x faster thanreshape2:::melt
anddcast.data.table
is ~5x thanreshape2:::dcast
.tidyverse
/tidyr
solution: