I asked this question earlier, but the reply I received did not do what I wanted, and at the time I used Stata to do the job. However, as I routinely work with such data, I would like to do this in R. I have a data set of daily hospital admissions by age, sex and diagnosis. I wish to aggregate and reshape the data from long to wide. How could I achieve this? Sample data and required output are shown below. Each column header in the output combines a prefix for sex, age group and diagnosis. Thanks
Sample data
structure(list(diag = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), .Label = c("card", "cere"), class = "factor"), sex = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("Female", "Male"), class = "factor"),
age = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("35-64",
"65-74"), class = "factor"), admissions = c(1L, 1L, 0L, 0L,
6L, 6L, 6L, 1L, 4L, 0L, 0L, 0L, 4L, 6L, 5L, 2L, 2L, 4L, 1L,
0L, 6L, 5L, 6L, 4L), bdate = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L), .Label = c("1987-01-01", "1987-01-02",
"1987-01-03"), class = "factor")), .Names = c("diag", "sex",
"age", "admissions", "bdate"), row.names = c(NA, -24L), class = "data.frame")
Required output
structure(list(date = structure(1:3, .Label = c("01jan1987",
"02jan1987", "03jan1987"), class = "factor"), f3564card = c(1L,
4L, 2L), f6574card = c(1L, 0L, 4L), m3564card = c(0L, 0L, 1L),
m6574card = c(0L, 0L, 0L), f3564cere = c(6L, 4L, 6L), f6574cere = c(6L,
6L, 5L), m3564cere = c(6L, 5L, 6L), m6574cere = c(1L, 2L,
4L)), .Names = c("date", "f3564card", "f6574card", "m3564card",
"m6574card", "f3564cere", "f6574cere", "m3564cere", "m6574cere"
), class = "data.frame", row.names = c(NA, -3L))
Your data are already in a long format that can be used easily by "reshape2", like this:
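The code that originally followed is not preserved here, so what follows is only a minimal sketch of one way to do it, not necessarily the call the answer used. It assumes the "reshape2" package is installed, rebuilds the question's sample data under the hypothetical name mydf, and introduces a helper column key (also hypothetical) so that the wide column names match the requested f3564card style.

```r
library(reshape2)

# Rebuild the sample data from the question (mydf is an assumed name).
# expand.grid varies its first argument fastest, which matches the row
# order of the original data: age, then sex, then diag, then bdate.
mydf <- expand.grid(
  age   = c("35-64", "65-74"),
  sex   = c("Female", "Male"),
  diag  = c("card", "cere"),
  bdate = c("1987-01-01", "1987-01-02", "1987-01-03")
)
mydf$admissions <- as.integer(c(1, 1, 0, 0, 6, 6, 6, 1,
                                4, 0, 0, 0, 4, 6, 5, 2,
                                2, 4, 1, 0, 6, 5, 6, 4))

# Build the target column name: first letter of sex (lower case),
# age group without the hyphen, then the diagnosis code.
key <- paste0(tolower(substr(mydf$sex, 1, 1)),
              gsub("-", "", mydf$age),
              mydf$diag)

# Keep the order in which the keys first appear, so the wide columns
# come out in the same order as in the required output.
mydf$key <- factor(key, levels = unique(key))

# Long to wide: one row per date, one column per sex/age/diag combination.
wide <- dcast(mydf, bdate ~ key, value.var = "admissions")
wide
```

A plainer alternative is dcast(mydf, bdate ~ diag + sex + age, value.var = "admissions"), which avoids the helper column but produces names such as card_Female_35-64 that would still need renaming.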
I don't see any aggregation in your sample output, but if aggregation is required, you can achieve this with the fun.aggregate argument of dcast.
Considering that cvd and ACS are not mutually exclusive to males and females respectively,
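As a sketch of the fun.aggregate option mentioned above: if the wide table should have fewer cells than the long data has rows, dcast needs a function to combine the values that fall into the same cell. The example below sums admissions over the two age groups. The data frame name mydf and the choice of sum are assumptions made for illustration; the question does not say which aggregation is wanted.

```r
library(reshape2)

# Same sample data as in the question, rebuilt compactly (mydf is an assumed name).
mydf <- expand.grid(
  age   = c("35-64", "65-74"),
  sex   = c("Female", "Male"),
  diag  = c("card", "cere"),
  bdate = c("1987-01-01", "1987-01-02", "1987-01-03")
)
mydf$admissions <- as.integer(c(1, 1, 0, 0, 6, 6, 6, 1,
                                4, 0, 0, 0, 4, 6, 5, 2,
                                2, 4, 1, 0, 6, 5, 6, 4))

# Leaving age out of the formula puts two rows into each cell,
# so fun.aggregate tells dcast how to combine them (here: add them up).
agg <- dcast(mydf, bdate ~ sex + diag,
             value.var = "admissions",
             fun.aggregate = sum)
agg
```

Without fun.aggregate, dcast would fall back to counting rows per cell (length) and print a message saying so, which is rarely what is wanted for admission counts.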