Count number of rows within each group-第2页回答

I have a dataframe and I would like to count the number of rows within each group. I reguarly use the aggregate function to sum data as follows:

df2 <- aggregate(x ~ Year + Month, data = df1, sum)

Now, I would like to count observations but can't seem to find the proper argument for FUN. Intuitively, I thought it would be as follows:

df2 <- aggregate(x ~ Year + Month, data = df1, count)

But, no such luck.

Any ideas?

Some toy data:

set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

标签： r dataframe r-faq

12条回答

谁念西风独自凉

2楼-- · 2018-12-31 03:26

Following @Joshua's suggestion, here's one way you might count the number of observations in your df dataframe where Year = 2007 and Month = Nov (assuming they are columns):

nrow(df[,df$YEAR == 2007 & df$Month == "Nov"])

and with aggregate, following @GregSnow:

aggregate(x ~ Year + Month, data = df, FUN = length)

0人赞添加讨论(0) 举报

牵手、夕阳

3楼-- · 2018-12-31 03:28

An alternative to the aggregate() function in this case would be table() with as.data.frame(), which would also indicate which combinations of Year and Month are associated with zero occurrences

df<-data.frame(x=rep(1:6,rep(c(1,2,3),2)),year=1993:2004,month=c(1,1:11))

myAns<-as.data.frame(table(df[,c("year","month")]))

And without the zero-occurring combinations

myAns[which(myAns$Freq>0),]

0人赞添加讨论(0) 举报

骚的不知所云

4楼-- · 2018-12-31 03:31

Considering @Ben answer, R would throw an error if df1 does not contain x column. But it can be solved elegantly with paste:

aggregate(paste(Year, Month) ~ Year + Month, data = df1, FUN = NROW)

Similarly, it can be generalized if more than two variables are used in grouping:

aggregate(paste(Year, Month, Day) ~ Year + Month + Day, data = df1, FUN = NROW)

0人赞添加讨论(0) 举报

ら面具成の殇う

5楼-- · 2018-12-31 03:33

The simple option to use with aggregate is the length function which will give you the length of the vector in the subset. Sometimes a little more robust is to use function(x) sum( !is.na(x) ).

0人赞添加讨论(0) 举报

深知你不懂我心

6楼-- · 2018-12-31 03:33

lw<- function(){length(which(df$variable==someValue))}

agg<- aggregate(Var1~Var2+Var3, data=df, FUN=lw)

names(agg)<- c("Some", "Pretty", "Names", "Here")

View(agg)

0人赞添加讨论(0) 举报

素衣白纱

7楼-- · 2018-12-31 03:36

A sql solution using sqldf package:

library(sqldf)
sqldf("SELECT Year, Month, COUNT(*) as Freq
       FROM df1
       GROUP BY Year, Month")

0人赞添加讨论(0) 举报

上一页 1 2

Count number of rows within each group

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间