Is it possible to include two functions within a single tapply or aggregate statement?
Below I use two tapply statements and two aggregate statements: one for mean and one for SD.
I would prefer to combine the statements.
my.Data = read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", sep = "", header = TRUE)
with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x) }))
with(my.Data, aggregate(weight ~ age + sex, FUN = mean))
with(my.Data, aggregate(weight ~ age + sex, FUN = sd))
# this does not work:
with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))
# I would also prefer the output to be formatted similarly to what is
# shown below. `aggregate` formats the output perfectly. I just cannot figure
# out how to apply two functions in one statement.
age    sex     mean  sd
adult  female  97.5  3.535534
adult  male    90    NA
young  female  80.0  NA
young  male    75    NA
I can always run two separate statements and merge the output. I was just hoping there might be a slightly more convenient solution.
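For reference, the two-statement fallback mentioned above might look like the sketch below. The names `means`, `sds`, and `both` are only illustrative, and the `data =` argument is used in place of `with()`.

```r
# Sample data from the question.
my.Data <- read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", header = TRUE)

# One aggregate call per statistic.
means <- aggregate(weight ~ age + sex, data = my.Data, FUN = mean)
sds   <- aggregate(weight ~ age + sex, data = my.Data, FUN = sd)

# Rename the value columns so they do not collide,
# then join the two results on the grouping columns.
names(means)[names(means) == "weight"] <- "mean"
names(sds)[names(sds) == "weight"] <- "sd"
both <- merge(means, sds, by = c("age", "sex"))
print(both)
```

This works, but it computes the grouping twice, which is the inconvenience the question is trying to avoid.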
I found the answer below in the question "Apply multiple functions to column using tapply":
f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )
However, neither the rows nor the columns are labeled.
[,1] [,2]
[1,] 97.5 3.535534
[2,] 80.0 NA
[3,] 90.0 NA
[4,] 75.0 NA
I would prefer a solution in base R. A solution using the plyr package was posted at the link above. If I could add the correct row and column headings to the above output, it would be perfect.
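One possible way to add those headings in base R is sketched below. It relies on `tapply` returning a list-array whose cells are stored column by column, so `expand.grid` over the dimnames gives the labels in the same order. The names `cells`, `keys`, `keep`, and `labeled` are made up for this illustration.

```r
# Sample data from the question.
my.Data <- read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", header = TRUE)

# Returning a named vector gives the columns their headings.
f <- function(x) c(mean = mean(x), sd = sd(x))

# tapply yields a 2 x 2 list-array: rows are ages, columns are sexes.
cells <- with(my.Data, tapply(weight, list(age, sex), f))

# Group labels in the same (column-major) order as the cells.
keys <- expand.grid(age = rownames(cells), sex = colnames(cells),
                    stringsAsFactors = FALSE)

# rbind silently drops empty (NULL) cells, so drop their labels too.
keep <- !vapply(cells, is.null, logical(1))

labeled <- data.frame(keys[keep, , drop = FALSE],
                      do.call(rbind, cells),
                      row.names = NULL)
print(labeled)
```

The rows come out in the same order as the unlabeled matrix above, with the `age`, `sex`, `mean`, and `sd` headings attached.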
The reshape package lets you pass two functions; reshape2 does not.
I owe this to @Ramnath, who noted this just yesterday.
If you'd like to use data.table, it has `with` and `by` built right into it:
I used a slightly different data set so as to not have `NA` values for `sd`.
In the spirit of sharing, if you are familiar with SQL, you might also consider the "sqldf" package. (Emphasis added because you do need to know, for instance, that `mean` is `avg` in order to get the results you want.)
But these should have:
The principle to be adhered to is to have your function return "one thing" which could be either a vector or a list but cannot be the successive invocation of two function calls.
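As a minimal base R sketch of that principle, the function passed to `aggregate` can return a single named vector holding both statistics. The names `res` and `flat` are only for illustration, and the flattening step is one common idiom rather than the only option.

```r
# Sample data from the question.
my.Data <- read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", header = TRUE)

# FUN returns "one thing": a named vector with both statistics.
res <- aggregate(weight ~ age + sex, data = my.Data,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))

# res$weight is a two-column matrix stored inside the data frame.
# Rebuilding the data frame splits it into weight.mean and weight.sd.
flat <- do.call(data.frame, res)

# Optional: drop the "weight." prefix to get plain mean / sd headings.
names(flat) <- sub("^weight\\.", "", names(flat))
print(flat)
```

By contrast, `function(x) {mean(x) ; sd(x)}` evaluates both calls but returns only the value of the last one, which is why the attempt in the question yields just the standard deviations.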