Make rows with multiple observations into columns

2019-06-07 07:02发布

问题:

I was looking at similar questions but I couldn't find a case similar to mine. I have a data frame that for each subject, has multiple observations per condition. It looks like this:

subject   <- c(rep("S1",4), rep("S2",4))
condition <- rep(c(rep("a",2), rep("b",2)),2)
value     <- c(1:8)
df        <- data.frame(subject,condition,value)

df
  subject condition value
  S1         a     1
  S1         a     2
  S1         b     3
  S1         b     4
  S2         a     5
  S2         a     6
  S2         b     7
  S2         b     8

I would like to reshape it to look like this:

  subject condition.a condition.b
  S1           1           3
  S1           2           4
  S2           5           7
  S2           6           8

I have tried reshape and cast, but they give me an error message because there are multiple observations per subject and condition. Does anyone have suggestions on how they would do this?

Thanks!

回答1:

The question is slightly ambiguous in the sense that it is not clear which values of a and b should be coupled with each other.

Assuming that you want the 1st value for (S1, a) to couple with the first (S2, b) and so on, then you can add a dummy count column which counts the various occurrences of (subject, condition) and then use the count_id to melt and dcast the data like this:

library(plyr)
library(reshape2)

subject   <- c(rep("S1",4), rep("S2",4))
condition <- rep(c(rep("a",2), rep("b",2)),2)
value     <- c(1:8)
df        <- data.frame(subject,condition,value)

df.2 <- ddply(df, .(subject, condition), function(x) { x$count <- 1:nrow(x); x })
df.2
#   subject condition value count
# 1      S1         a     1     1
# 2      S1         a     2     2
# 3      S1         b     3     1
# 4      S1         b     4     2
# 5      S2         a     5     1
# 6      S2         a     6     2
# 7      S2         b     7     1
# 8      S2         b     8     2

df.3 <- melt(df.2, id.vars=c('subject', 'condition', 'count'))
dcast(df.3, subject + count ~ condition)
#   subject count a b
# 1      S1     1 1 3
# 2      S1     2 2 4
# 3      S2     1 5 7
# 4      S2     2 6 8

Is this what you wanted?