Create a new column in a data.table from a group-by count

Posted 2019-08-20 16:54

I'm working on a data.table that includes X and Y columns, and I want to create a new column Z that holds the number of records sharing the same (X, Y) values.

I know the syntax when working with a data.frame (using ddply from plyr):

ddply(df,.(X,Y),nrow)

I tried several syntaxes I found on this forum, but none of them worked:

dt[, Z := lapply(.SD,nrow), by="X,Y"] # or   
dt[, `:=`(Z = lapply(.SD,nrow)), by="X,Y"]   

To be precise, X and Y are numeric.

1 answer
劫难
#2 · 2019-08-20 17:27

Starting from

library(data.table)
dt <- data.table(X = c(1, 1, 2), Y = c(1, 1, 2))

The appropriate syntax uses .N, which holds the number of rows in the current group:

dt[, Z := .N, by = c("X","Y")]

or

dt[, Z := .N, by = .(X,Y)]
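
To check the result, here is a minimal sketch built on the same sample data. The name `counts` is only illustrative and does not appear in the question; the second call is an assumption about what a one-row-per-group summary (closer to the ddply output) might look like.

```r
library(data.table)

dt <- data.table(X = c(1, 1, 2), Y = c(1, 1, 2))

# .N is the number of rows in the current (X, Y) group;
# := adds the column to dt by reference, keeping every original row.
dt[, Z := .N, by = .(X, Y)]
print(dt)  # Z is 2, 2, 1

# Without :=, the same grouping returns one row per (X, Y) pair instead,
# which is closer to what ddply(df, .(X, Y), nrow) produces.
counts <- dt[, .(Z = .N), by = .(X, Y)]
print(counts)  # two rows, with Z equal to 2 and 1
```

As for the attempts in the question: lapply(.SD, nrow) calls nrow on each column of .SD, and nrow applied to a plain vector returns NULL, so it cannot yield a per-group row count.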