Convert data frame common rows to columns

2019-01-20 04:12发布

问题:

Say I have a data frame where one column is some repeating value (dates, IDs, etc). Is there a way to convert a data frame into a now data frame with columns instead of replicating rows? Basically I want to transpose something like this:

col1 col2 col3
1    aa   30
2    aa   40
3    aa   10
1    bb   20
2    bb   12
3    bb   15
1    cc   40
2    cc   31
3    cc   12

Into this:

    aa   bb   cc
1   30   20   40
2   40   12   31
3   10   15   12

Here is some code that makes a sample of the first data frame:

a <- c(rep(1:10, 3))
b <- c(rep("aa", 10), rep("bb", 10), rep("cc", 10))
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a,b, c)

I am unsure how to transpose it.

回答1:

a <- c(rep(1:10, 3))
b <- c(rep("aa", 10), rep("bb", 10), rep("cc", 10))
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a,b, c)
#how to transpose it#
e<-reshape(d,idvar='a',timevar='b',direction='wide')
e


回答2:

This is also a case in which you can use unstack:

unstack(d, c ~ b)
#    aa bb cc
# 1  30 50 50
# 2  45 35 40
# 3  30 40 40
# 4  50 40 50
# 5  50 20 40
# 6  20 50 40
# 7  35 25 35
# 8  50 20 40
# 9  35 30 30
# 10 35 50 25


回答3:

Using your data frame d,

library(tidyr)
> spread(d, key = b, value = c)
    a aa bb cc
1   1 30 50 50
2   2 45 35 40
3   3 30 40 40
4   4 50 40 50
5   5 50 20 40
6   6 20 50 40
7   7 35 25 35
8   8 50 20 40
9   9 35 30 30
10 10 35 50 25

Explanation, the argument key = b lets you specify a column in your data frame. spread will create a new column for each unique entry in the key column b. The argument value = c tells spread to retrieve the value in column c and write it in the corresponding new key column.



回答4:

If there are always equal numbers of observations in each group, this would be very easy with split then as.data.frame

as.data.frame(split(d$c, d$b))
#    aa bb cc
# 1  30 50 50
# 2  45 35 40
# 3  30 40 40
# 4  50 40 50
# 5  50 20 40
# 6  20 50 40
# 7  35 25 35
# 8  50 20 40
# 9  35 30 30
# 10 35 50 25


回答5:

With split and cbind:

> ll = lapply(split(d, d$b), function(x) x[3])
> dd = do.call(cbind, ll)
> names(dd) = names(ll)
> dd
   aa bb cc
1  30 50 50
2  45 35 40
3  30 40 40
4  50 40 50
5  50 20 40
6  20 50 40
7  35 25 35
8  50 20 40
9  35 30 30
10 35 50 25


标签: r reshape