How to use aggregate with a list of column names

Posted 2019-06-22 15:45

How can a call to aggregate be wrapped in a function that takes a list of condition (grouping) columns and a list of value columns to summarize?

# This works fine:
x <- data.frame(cond1 = sample(letters[1:3], 500, replace=TRUE), 
                cond2 = sample(LETTERS[1:7], 500, replace = TRUE), 
                cond3 = sample(LETTERS[1:4], 500, replace = TRUE), 
                value1 = rnorm(500), 
                value2 = rnorm(500))

aggregate(cbind(value1,value2) ~ cond1 + cond2, data = x, FUN=sum)

I need to supply the column names as character vectors (three options shown), then call the function:

c1 <- c("cond1","cond2","cond3"); v1 <- c("value1","value2")
c1 <- c("cond2","cond3");         v1 <- c("value2")
c1 <- c("cond3");                 v1 <- c("value1")

# Desired call; this does not work as written:
aggregate(cbind(v1) ~ c1, data = x, FUN=sum)

I have reviewed many alternatives, but have not yet discovered the key to this abstraction.

Tags: r aggregate
2 Answers
Evening l夕情丶
#2 · 2019-06-22 16:41

You can use the alternative interface to aggregate, which does not use a formula:

c1 <- c("cond1","cond2","cond3")
v1 <- c("value1","value2")
aggregate(x[v1],by=x[c1],FUN=sum)

   cond1 cond2 cond3     value1      value2
1      a     A     A -3.3025839 -0.98304649
2      b     A     A  0.6326985 -3.08677485
3      c     A     A  3.6007853  2.23962265
4      a     B     A -0.5247620 -0.94644740
5      b     B     A  0.9242562  2.48268452
6      c     B     A  6.9215712  0.31512645
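
Since the question asks for a function, the call above could be wrapped along these lines. This is only a minimal sketch: the names `aggregate_cols` and `aggregate_formula` and their arguments are hypothetical, not part of base R. The second helper is an alternative that builds the formula from the names with `as.formula`, in case the formula interface is preferred.

```r
# Hypothetical helper: aggregate the columns named in `values`,
# grouped by the columns named in `conds`.
aggregate_cols <- function(data, conds, values, FUN = sum, ...) {
  # data[values] and data[conds] remain data frames, so column names are kept
  aggregate(data[values], by = data[conds], FUN = FUN, ...)
}

# Alternative sketch: build "cbind(v1, v2) ~ c1 + c2" as text, then convert it
aggregate_formula <- function(data, conds, values, FUN = sum, ...) {
  f <- as.formula(paste0("cbind(", paste(values, collapse = ", "), ") ~ ",
                         paste(conds, collapse = " + ")))
  aggregate(f, data = data, FUN = FUN, ...)
}

# Example data in the same shape as the question's (seed added for repeatability)
set.seed(1)
x <- data.frame(cond1 = sample(letters[1:3], 500, replace = TRUE),
                cond2 = sample(LETTERS[1:7], 500, replace = TRUE),
                cond3 = sample(LETTERS[1:4], 500, replace = TRUE),
                value1 = rnorm(500),
                value2 = rnorm(500))

res1 <- aggregate_cols(x, c("cond1", "cond2"), c("value1", "value2"))
res2 <- aggregate_formula(x, "cond3", "value1")
```

Both helpers take plain character vectors, so any of the three c1/v1 combinations from the question can be passed unchanged.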
男人必须洒脱
#3 · 2019-06-22 16:49
c1 <- list(c("cond1","cond2","cond3"), c("cond2","cond3"), c("cond3"))
v1 <- list(c("value1","value2"), c("value2"), c("value1"))

mapply(FUN = function(z, y, ...) {aggregate(x[ , y], by = x[z], ...)},
       c1, v1, MoreArgs = list(FUN = sum))

The result is a list of three data frames:

[[1]]
   cond1 cond2 cond3      value1      value2
1      a     A     A  0.19396539  1.11536490
2      b     A     A -1.20056699 -5.36713982
3      c     A     A -0.19716521 -2.06737461
4      a     B     A  1.58880450 -7.62452134
5      b     B     A -4.68579210  0.47266047
6      c     B     A  2.70550795 -0.50020883
7      a     C     A  1.69312219 -4.26851536
8      b     C     A  0.99236424  4.85013434
snipped remaining 76 rows

[[2]]
   cond2 cond3           x
1      A     A -6.31914953
2      B     A -7.65206970
3      C     A  1.36818527
4      D     A  3.77492482
5      E     A  2.68977303
snipped 23 rows

[[3]]
  cond3          x
1     A   8.104481
2     B  17.766659
3     C -14.577315
4     D   4.398249
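
In the second and third results above the value column is named x, because x[ , y] with a single column name returns a plain vector and the original name is lost. A possible variant, sketched here under the assumption that keeping the names matters, indexes with x[v] instead and uses Map, which always returns a list. The names `conds`, `vals` and `results` are illustrative only.

```r
# Example data in the same shape as the question's (seed added for repeatability)
set.seed(1)
x <- data.frame(cond1 = sample(letters[1:3], 500, replace = TRUE),
                cond2 = sample(LETTERS[1:7], 500, replace = TRUE),
                cond3 = sample(LETTERS[1:4], 500, replace = TRUE),
                value1 = rnorm(500),
                value2 = rnorm(500))

conds <- list(c("cond1", "cond2", "cond3"), c("cond2", "cond3"), "cond3")
vals  <- list(c("value1", "value2"), "value2", "value1")

# x[v] (rather than x[, v]) stays a data frame even for a single column,
# so each value column keeps its own name instead of becoming "x"
results <- Map(function(cn, v) aggregate(x[v], by = x[cn], FUN = sum),
               conds, vals)
```

Map is a thin wrapper around mapply with SIMPLIFY = FALSE, so the result stays a list even if every data frame happens to have the same shape.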