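Is it possible to include two functions in a single tapply or aggregate statement?
Below I use two tapply statements and two aggregate statements: one for the mean and one for the SD.
I would prefer to combine the statements.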
my.Data = read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", sep = "", header = TRUE)
with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x) }))
with(my.Data, aggregate(weight ~ age + sex, FUN = mean))
with(my.Data, aggregate(weight ~ age + sex, FUN = sd))
# this does not work:
with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))
# I would also prefer that the output be formatted similarly to what is
# shown below. `aggregate` formats the output perfectly. I just cannot figure
# out how to implement two functions in one statement.
age sex mean sd
adult female 97.5 3.535534
adult male 90 NA
young female 80.0 NA
young male 75 NA
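I can always run two separate statements and merge the output. I was just hoping there might be a slightly more convenient solution.
I found the answer below posted here: Apply multiple functions to column using tapply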
f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )
However, neither the rows nor the columns are labelled.
[,1] [,2]
[1,] 97.5 3.535534
[2,] 80.0 NA
[3,] 90.0 NA
[4,] 75.0 NA
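I would prefer a solution in base R. A solution using the plyr package was posted at the link above. If I could add the correct row and column headings to the output above, it would be perfect.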
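One way the missing headings might be recovered in base R is sketched below. It relies on the fact that `tapply` keeps the group names in the `dimnames` of its result, and that `expand.grid` enumerates them in the same column-major order in which `do.call(rbind, ...)` stacks the cells. The names `res`, `labels` and `out` are illustrative only, and the sketch assumes every age/sex combination has at least one animal (an empty cell would be dropped by `rbind` and the labels would no longer line up).

```r
my.Data <- read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", header = TRUE)

# Return both statistics as one named vector
f <- function(x) c(mean = mean(x), sd = sd(x))

# res is a 2 x 2 list-array whose dimnames hold the age and sex levels
res <- with(my.Data, tapply(weight, list(age = age, sex = sex), f))

# One row of labels per cell, in the same order rbind stacks the cells
labels <- expand.grid(dimnames(res))

# Bind the labels to the stacked statistics
out <- cbind(labels, do.call(rbind, res))
out
```

The column names `mean` and `sd` come from the names given inside `f`, so renaming them there is enough to change the headings.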
But these should work:
with(my.Data, aggregate(weight, list(age, sex), function(x) { c(MEAN=mean(x), SD=sd(x) )}))
with(my.Data, tapply(weight, list(age, sex), function(x) { c(mean(x) , sd(x) )} ))
# Not a nice structure but the results are in there
with(my.Data, aggregate(weight ~ age + sex, FUN = function(x) c( SD = sd(x), MN= mean(x) ) ) )
age sex weight.SD weight.MN
1 adult female 3.535534 97.500000
2 young female NA 80.000000
3 adult male NA 90.000000
4 young male NA 75.
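The principle to follow is to have your function return "one thing", which can be a vector or a list, but cannot be two function calls made one after the other.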
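A small sketch of that principle, using the question's data: a braced body with two calls evaluates both but returns only the value of the last one, whereas a named vector carries both results back. The helper names `last_only` and `both` are made up for illustration. Because `aggregate` stores a vector-valued result as a single matrix column, the sketch also shows one common way to flatten it into ordinary columns with `do.call(data.frame, ...)`.

```r
my.Data <- read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", header = TRUE)

# Both calls run, but only the last value (the sd) is returned
last_only <- function(x) { mean(x); sd(x) }
last_only(c(100, 95))

# Returning "one thing": a named vector holding both statistics
both <- function(x) c(mean = mean(x), sd = sd(x))
both(c(100, 95))

# aggregate keeps the two statistics in one matrix column called weight
agg <- aggregate(weight ~ age + sex, data = my.Data, FUN = both)
ncol(agg)

# Flatten the matrix column into separate weight.mean and weight.sd columns
flat <- do.call(data.frame, agg)
flat
```

Flattening is optional; it mainly matters if later code expects `weight.mean` and `weight.sd` to be real columns rather than columns of an embedded matrix.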
If you would like to use data.table, it has `with` and `by` functionality built into it:
library(data.table)
myDT <- data.table(my.Data, key="animal")
myDT[, c("mean", "sd") := list(mean(weight), sd(weight)), by=list(age, sex)]
myDT[, list(mean_Aggr=sum(mean(weight)), sd_Aggr=sum(sd(weight))), by=list(age, sex)]
age sex mean_Aggr sd_Aggr
1: adult female 96.0 3.6055513
2: young male 76.5 2.1213203
3: adult male 91.0 1.4142136
4: young female 84.5 0.7071068
I used a slightly different data set so as not to have NA values for the SD.
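For comparison, here is a minimal sketch of the same grouped summary run on the question's original data, assuming the data.table package is installed. Wrapping the expressions in `.()` (an alias for `list()`) returns one row per group, while `:=` instead adds the group statistics as new columns to every row. The names `dt` and `res` are illustrative. Groups containing a single animal give `NA` for the standard deviation.

```r
library(data.table)

my.Data <- read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", header = TRUE)

dt <- as.data.table(my.Data)

# One row per age/sex group, with both statistics computed in one statement
res <- dt[, .(mean = mean(weight), sd = sd(weight)), by = .(age, sex)]
res
```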
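In the spirit of sharing, if you are familiar with SQL, you might also consider the "sqldf" package. (Emphasis added because you need to know, for example, that `mean` is `avg` in order to get the results you want.)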
library(sqldf)
sqldf("select age, sex,
avg(weight) `Wt.Mean`,
stdev(weight) `Wt.SD`
from `my.Data`
group by age, sex")
age sex Wt.Mean Wt.SD
1 adult female 97.5 3.535534
2 adult male 90.0 0.000000
3 young female 80.0 0.000000
4 young male 75.0 0.000000
reshape lets you pass two functions; reshape2 does not.
library(reshape)
my.Data = read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", sep = "", header = TRUE)
my.Data[,1]<- NULL
(a1 <- melt(my.Data, id.vars = c("age", "sex"), measure.vars = "weight"))
(cast(a1, age + sex ~ variable, c(mean, sd), fill=NA))
# age sex weight_mean weight_sd
# 1 adult female 97.5 3.535534
# 2 adult male 90.0 NA
# 3 young female 80.0 NA
# 4 young male 75.0 NA
I owe this to @Ramnath, who pointed it out just yesterday.