Reshape data from long to semi-wide in R

Posted 2020-03-25 06:04

Question:

I have data in which each participant made 3 judgments on each of 9 objects (27 judgments). The 9 objects varied in a 3x3 design (within subjects) so there are 2 factors.

I'm starting with ID + 27 data columns, and I need to have

  • ID
  • 2 factor columns: Performance, Situation
  • 3 value columns: Success, ProbAdmit, Admit

I have read the manuals on reshape(), melt() and cast() but haven't yet been able to figure out how to get there. Here is my progress so far, which also loads my actual data.

scsc3 <- read.csv("http://swift.cbdr.cmu.edu/data/SCSC3-2006-10-10.csv")
library(reshape)
scsc3.long <- melt(scsc3,id="Participant")
scsc3.long <- cbind(scsc3.long,colsplit(scsc3.long$variable,split="[.]",names=c("Item","Candidate","Performance","Situation")))
scsc3.long$variable <- NULL
scsc3.long$Candidate <- NULL

The above code leaves me with this:

Participant  value  Item      Performance  Situation
4001         5.0    Success   GL           IL
4001         60     ProbAdmit GL           IL
4001         1      Admit     GL           IL
4002         ....

What I need is a data frame like this:

Participant Performance  Situation SuccessValue ProbAdmitValue AdmitValue
4001        GL           IL        5.0          60             1
...

Thanks!

Answer 1:

Try this:

library(reshape2)
# Note: current reshape2 spells this argument value.var; older releases used value_var.
dcast(scsc3.long,
      Participant + Performance + Situation ~ Item,
      value.var = "value")

  Participant Performance Situation Admit ProbAdmit Success
1        4001          GH        IH     1       100       7
2        4001          GH        IL     1        50       5
3        4001          GH        IM     1        60       5
4        4001          GL        IH     0        40       3
5        4001          GL        IL     0         0       2
6        4001          GL        IM     0        40       4
...

One way to think of what dcast is doing: it "casts" the data frame into a wide format in which each row is one combination of Participant + Performance + Situation and each column is one of the possible values of Item, i.e. Admit, ProbAdmit, Success. The value.var = "value" argument (spelled value_var in older reshape2 releases) tells dcast to fill each row-column cell with the corresponding entry of the value column.
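
For comparison, the same kind of cast can be sketched with base R's reshape(), which needs no extra package. This is only an illustration under assumptions: toy.long and its values are made up for the example (they are not the SCSC3 data), and the column names simply mirror the ones in the question.

```r
# Toy long-format data: hypothetical values, NOT taken from the SCSC3 file.
toy.long <- data.frame(
  Participant = rep(c(1, 2), each = 6),
  Performance = rep(rep(c("GH", "GL"), each = 3), times = 2),
  Situation   = "IL",
  Item        = rep(c("Success", "ProbAdmit", "Admit"), times = 4),
  value       = c(7, 100, 1,  3, 40, 0,  6, 80, 1,  2, 10, 0)
)

# idvar   : columns that identify one output row
# timevar : column whose distinct values become new columns
# v.names : column that supplies the cell entries
toy.wide <- reshape(toy.long,
                    idvar     = c("Participant", "Performance", "Situation"),
                    timevar   = "Item",
                    v.names   = "value",
                    direction = "wide")

# reshape() names the new columns value.Success, value.ProbAdmit, ...;
# drop the "value." prefix so they match the Item labels.
names(toy.wide) <- sub("^value\\.", "", names(toy.wide))

print(toy.wide)
```

Here idvar plays the role of the left-hand side of the dcast formula, timevar the right-hand side, and v.names the value.var argument.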