How to get summary statistics by group

2019-01-01 06:01发布

I'm trying to get multiple summary statistics in R/S-PLUS grouped by categorical column in one shot. I found couple of functions, but all of them do one statistic per call, like `aggregate().

data <- c(62, 60, 63, 59, 63, 67, 71, 64, 65, 66, 68, 66, 
          71, 67, 68, 68, 56, 62, 60, 61, 63, 64, 63, 59)
grp <- factor(rep(LETTERS[1:4], c(4,6,6,8)))
df <- data.frame(group=grp, dt=data)
mg <- aggregate(df$dt, by=df$group, FUN=mean)    
mg <- aggregate(df$dt, by=df$group, FUN=sum)    

What I'm looking for is to get multiple statistics for the same group like mean, min, max, std, ...etc in one call, is that doable?

标签: r s
9条回答
柔情千种
2楼-- · 2019-01-01 06:14

There's many different ways to go about this, but I'm partial to describeBy in the psych package:

describeBy(df$dt, df$group, mat = TRUE) 
查看更多
余生无你
3楼-- · 2019-01-01 06:14

take a look at the plyr package. Specifically, ddply

ddply(df, .(group), summarise, mean=mean(dt), sum=sum(dt))
查看更多
谁念西风独自凉
4楼-- · 2019-01-01 06:20

Using Hadley Wickham's purrr package this is quite simple. Use split to split the passed data_frame into groups, then use map to apply the summary function to each group.

library(purrr)

df %>% split(.$group) %>% map(summary)
查看更多
登录 后发表回答