I have one dataframe:
    Date     area sales
1 201204 shanghai    23
2 201204  beijing    25
3 201204  beijing    16
4 201205 shanghai    55
5 201205  beijing    17
6 201205 shanghai    16
What I want to output is a table as follows:
  Date shanghai beijing
201204       23      41
201205       71      17
How would I do this in R?
In base R (for sum) there's xtabs:
> xtabs(sales ~ Date + area, mydf)
        area
Date     beijing shanghai
  201204      41       23
  201205      17       71
To get it as a data.frame, wrap it in as.data.frame.matrix.
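As a rough sketch of that wrapping step, the snippet below rebuilds the question's data under the name mydf (the column types are an assumption, since the question only shows printed output) and converts the xtabs result to a data.frame. The Date values end up as row names, so the last line copies them into an ordinary column; the name wide is only illustrative.

```r
# Rebuild the example data from the question (column types assumed)
mydf <- data.frame(
  Date  = c(201204, 201204, 201204, 201205, 201205, 201205),
  area  = c("shanghai", "beijing", "beijing", "shanghai", "beijing", "shanghai"),
  sales = c(23, 25, 16, 55, 17, 16)
)

# Cross-tabulate the summed sales, then convert the table to a data.frame
wide <- as.data.frame.matrix(xtabs(sales ~ Date + area, mydf))

# Date is stored in the row names; copy it into a regular column if needed
wide <- cbind(Date = rownames(wide), wide)
wide
```

Note that Date comes back as character here because it is taken from the row names; convert it with as.numeric if you need a number.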
A more recent approach is to combine "dplyr" (for aggregation) and "tidyr" (for reshaping), like this:
library(tidyr)
library(dplyr)
mydf %>%
  group_by(Date, area) %>%
  summarise(sales = sum(sales)) %>%
  spread(area, sales)
# Source: local data frame [2 x 3]
#
#     Date beijing shanghai
# 1 201204      41       23
# 2 201205      17       71
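In newer releases of tidyr (1.0.0 and later), spread() is marked as superseded in favour of pivot_wider(). A minimal sketch of the same pipeline with that function follows. It assumes dplyr 1.0.0 or later for the .groups argument, and it rebuilds the question's data as mydf with assumed column types; wide_tbl is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question (column types assumed)
mydf <- data.frame(
  Date  = c(201204, 201204, 201204, 201205, 201205, 201205),
  area  = c("shanghai", "beijing", "beijing", "shanghai", "beijing", "shanghai"),
  sales = c(23, 25, 16, 55, 17, 16)
)

wide_tbl <- mydf %>%
  group_by(Date, area) %>%
  # Sum sales per Date/area pair and drop the grouping afterwards
  summarise(sales = sum(sales), .groups = "drop") %>%
  # One column per area, filled with the summed sales
  pivot_wider(names_from = area, values_from = sales)

wide_tbl
```

The result should hold the same totals as the spread() version, with one row per Date and one column per area.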
This is cannon fodder for reshape2::dcast:
library(reshape2)
# assuming your data is called `D`
dcast(D, Date ~ area, value.var = "sales", fun.aggregate = sum)