When I need to apply multiple functions to multiple columns, grouped by multiple columns, and want the results bound into a single data frame, I usually use aggregate() in the following manner:
# bogus functions
foo1 <- function(x){mean(x)*var(x)}
foo2 <- function(x){mean(x)/var(x)}
# convert the block factor to numeric (for illustration purposes only)
npk$block <- as.numeric(npk$block)
subdf <- aggregate(npk[, c("yield", "block")],
                   by = list(N = npk$N, P = npk$P),
                   FUN = function(x) {c(col1 = foo1(x), col2 = foo2(x))})
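Because FUN returns a vector of two values, aggregate() stores each result as a matrix column inside subdf. I flatten these into ordinary, nicely ordered columns by using: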
df <- do.call(data.frame, subdf)
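To make clear why the do.call() step is needed, here is a minimal sketch that inspects the result before and after flattening. It works on a copy of the built-in npk data set; the names d and flat are only illustrative and not part of my actual code.

```r
# Work on a copy so the built-in npk data set is left untouched
d <- npk
d$block <- as.numeric(d$block)

# Same bogus functions as above
foo1 <- function(x) mean(x) * var(x)
foo2 <- function(x) mean(x) / var(x)

subdf <- aggregate(d[, c("yield", "block")],
                   by = list(N = d$N, P = d$P),
                   FUN = function(x) c(col1 = foo1(x), col2 = foo2(x)))

# aggregate() returns only four columns; yield and block are matrices
names(subdf)            # "N" "P" "yield" "block"
is.matrix(subdf$yield)  # TRUE
colnames(subdf$yield)   # "col1" "col2"

# do.call(data.frame, ...) splits each matrix into ordinary columns
flat <- do.call(data.frame, subdf)
names(flat)
# "N" "P" "yield.col1" "yield.col2" "block.col1" "block.col2"
```

So subdf looks like it has six columns when printed, but really has four, two of which are two-column matrices. That is what I would like to avoid having to repair afterwards.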
Can I avoid the call to do.call() by using aggregate() more cleverly in this scenario, or shorten the whole process by using another base R solution from the start?