I know that there are many answers on this forum on how to get summary statistics (e.g. mean, se, N) for multiple groups using options like aggregate, ddply or data.table. I'm not sure, however, how to apply these functions over multiple columns at once.
More specifically, I would like to know how to extend the following ddply command over multiple columns (dv1, dv2, dv3) without re-typing the code with a different variable name each time.
library(reshape2)
library(plyr)
group1 <- c(rep(LETTERS[1:4], c(4,6,6,8)))
group2 <- c(rep(LETTERS[5:8], c(6,4,8,6)))
group3 <- c(rep(LETTERS[9:10], c(12,12)))
my.dat <- data.frame(group1, group2, group3, dv1=rnorm(24),dv2=rnorm(24),dv3=rnorm(24))
my.dat
# Summary statistics for dv1 within each combination of the three grouping variables
data1 <- ddply(my.dat, c("group1", "group2", "group3"), summarise,
               N    = length(dv1),
               mean = mean(dv1, na.rm = TRUE),
               sd   = sd(dv1, na.rm = TRUE),
               se   = sd / sqrt(N)
)
data1
How can I apply this ddply function over multiple columns such that the outcome will be data1, data2, data3... for each outcome variable? I thought this could be the solution:
dfm <- melt(my.dat, id.vars = c("group1", "group2","group3"))
lapply(list(.(group1, variable), .(group2, variable),.(group3, variable)),
ddply, .data = dfm, .fun = summarize,
mean = mean(value),
sd = sd(value),
N=length(value),
se=sd/sqrt(N))
This looks like it's in the right direction, but it's not exactly what I need: it provides the statistics for each grouping variable separately. What I need is an outcome like data1, where each group is a combination of all three grouping variables (e.g. the first aggregated group is people who are in A, E and I; the second is those who are in B, E and I, etc.).
If you don't want to melt into long format, you can also do:

which gives:
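A minimal sketch of one way to avoid melting, not necessarily the approach this answer used: loop over the outcome column names with lapply and run the question's ddply summary once per column. The names dvs and results, and the set.seed call, are assumptions added for illustration.

```r
library(plyr)

# Rebuild the example data from the question (set.seed is an added assumption,
# used only so the sketch is reproducible)
set.seed(42)
group1 <- rep(LETTERS[1:4], c(4, 6, 6, 8))
group2 <- rep(LETTERS[5:8], c(6, 4, 8, 6))
group3 <- rep(LETTERS[9:10], c(12, 12))
my.dat <- data.frame(group1, group2, group3,
                     dv1 = rnorm(24), dv2 = rnorm(24), dv3 = rnorm(24))

# Hypothetical name: the outcome columns to summarise
dvs <- c("dv1", "dv2", "dv3")

# One summary table per outcome column, grouped by all three grouping variables
results <- lapply(dvs, function(dv) {
  ddply(my.dat, c("group1", "group2", "group3"), function(d) {
    x <- d[[dv]]
    n <- sum(!is.na(x))
    s <- sd(x, na.rm = TRUE)
    data.frame(N = n, mean = mean(x, na.rm = TRUE), sd = s, se = s / sqrt(n))
  })
})
names(results) <- dvs

results$dv1  # plays the role of data1; results$dv2 and results$dv3 follow
```

Each element of results has one row per observed combination of group1, group2 and group3, so results$dv1 corresponds to data1 in the question.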
Here is a solution using dplyr. This gives the result in a "wide" format (i.e. the stats for dv1, dv2, dv3 are on the same line).

If having the stats for dv1, dv2, and dv3 on separate lines is desired, this can be modified using melt or gather (from tidyr).

Here's an illustration of reshaping your data first. I've written a custom function to improve readability:
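As a rough sketch of what reshaping first could look like (the helper name summary_stats and the set.seed call are assumptions, not the answer's original code): melt the data to long format, then group by the three grouping variables plus variable, so that each outcome gets its own rows.

```r
library(reshape2)
library(plyr)

# Rebuild the example data from the question (set.seed is an added assumption)
set.seed(42)
group1 <- rep(LETTERS[1:4], c(4, 6, 6, 8))
group2 <- rep(LETTERS[5:8], c(6, 4, 8, 6))
group3 <- rep(LETTERS[9:10], c(12, 12))
my.dat <- data.frame(group1, group2, group3,
                     dv1 = rnorm(24), dv2 = rnorm(24), dv3 = rnorm(24))

# Hypothetical helper: N, mean, sd and se of a numeric vector, ignoring NAs
summary_stats <- function(x) {
  n <- sum(!is.na(x))
  s <- sd(x, na.rm = TRUE)
  data.frame(N = n, mean = mean(x, na.rm = TRUE), sd = s, se = s / sqrt(n))
}

# Long format: one row per observation and outcome variable
dfm <- melt(my.dat, id.vars = c("group1", "group2", "group3"))

# Group by all three grouping variables at once, plus the outcome variable
long_stats <- ddply(dfm, c("group1", "group2", "group3", "variable"),
                    function(d) summary_stats(d$value))
long_stats
```

The difference from the attempt in the question is that all three grouping variables go into a single ddply call together with variable, rather than one call per grouping variable. If separate objects are wanted, split(long_stats, long_stats$variable) returns one table per outcome.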
Or without the custom function, thanks to @Jaap:
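One way this might look without a helper, sketched here with dplyr and tidyr rather than the code originally posted: reshape with pivot_longer and compute the statistics inline in summarise. This assumes dplyr 1.0 or later (for the .groups argument) and tidyr 1.0 or later (for pivot_longer); set.seed is again an added assumption.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question (set.seed is an added assumption)
set.seed(42)
group1 <- rep(LETTERS[1:4], c(4, 6, 6, 8))
group2 <- rep(LETTERS[5:8], c(6, 4, 8, 6))
group3 <- rep(LETTERS[9:10], c(12, 12))
my.dat <- data.frame(group1, group2, group3,
                     dv1 = rnorm(24), dv2 = rnorm(24), dv3 = rnorm(24))

long_stats <- my.dat %>%
  # Long format: one row per observation and outcome variable
  pivot_longer(c(dv1, dv2, dv3), names_to = "variable", values_to = "value") %>%
  # All three grouping variables together, plus the outcome variable
  group_by(group1, group2, group3, variable) %>%
  summarise(
    N    = sum(!is.na(value)),
    mean = mean(value, na.rm = TRUE),
    sd   = sd(value, na.rm = TRUE),
    se   = sd / sqrt(N),  # uses the sd and N columns created just above
    .groups = "drop"
  )
long_stats
```

The result has one row per group combination and outcome variable; filter(long_stats, variable == "dv1") reproduces the layout of data1.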