Multiple functions in a single tapply or aggregate statement

Posted 2019-07-21 00:17

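Is it possible to include two functions within a single tapply or aggregate statement?

Below I use two tapply statements and two aggregate statements: one for the mean and one for the SD.
I would prefer to combine the statements.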

my.Data = read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", sep = "", header = TRUE)

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x)  }))

with(my.Data, aggregate(weight ~ age + sex, FUN = mean))
with(my.Data, aggregate(weight ~ age + sex, FUN =   sd))

# this does not work:

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))

# I would also prefer that the output be formatted similarly to what is
# shown below.  `aggregate` formats the output perfectly.  I just cannot figure
# out how to implement two functions in one statement.

  age    sex   mean        sd
adult female   97.5  3.535534
adult   male     90        NA
young female   80.0        NA
young   male     75        NA

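I can always run two separate statements and merge the output. I was just hoping there might be a slightly more convenient solution.

I found the answer below posted here: Apply multiple functions to column using tapply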

f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )

However, neither the rows nor the columns are labeled.

     [,1]     [,2]
[1,] 97.5 3.535534
[2,] 80.0       NA
[3,] 90.0       NA
[4,] 75.0       NA

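I would prefer a solution in base R. A solution using the plyr package was posted at the link above. If I could add the correct row and column headings to the output above, it would be perfect.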
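One possible way to attach labels in base R, offered as a sketch rather than a definitive answer: have `f` return a named vector so the columns are labeled, and rebuild the row labels from the dimnames of the `tapply` result. The names `cells`, `labels`, and `out` are illustrative only. This assumes every age/sex combination occurs in the data; an empty cell would come back as `NULL`, `rbind` would skip it, and the labels would no longer line up.

```r
my.Data <- read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", header = TRUE)

# Returning a named vector gives the stacked matrix its column names.
f <- function(x) c(mean = mean(x), sd = sd(x))

# tapply returns a 2 x 2 list-matrix; each cell holds one c(mean, sd) vector.
cells <- with(my.Data, tapply(weight, list(age = age, sex = sex), f))

# expand.grid varies its first argument fastest, which matches the
# column-major order in which do.call(rbind, cells) walks the cells.
labels <- expand.grid(age = rownames(cells), sex = colnames(cells),
                      stringsAsFactors = FALSE)

# Bind the labels to the stacked per-cell results.
out <- cbind(labels, do.call(rbind, cells))
out
```

The result is an ordinary data frame with columns `age`, `sex`, `mean`, and `sd`, so it can be sorted or merged like any other.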

Answer 1:

But these should work:

with(my.Data, aggregate(weight, list(age, sex), function(x) { c(MEAN=mean(x), SD=sd(x) )}))

with(my.Data, tapply(weight, list(age, sex), function(x) { c(mean(x) , sd(x) )} ))
# Not a nice structure but the results are in there

with(my.Data, aggregate(weight ~ age + sex, FUN =  function(x) c( SD = sd(x), MN= mean(x) ) ) )
    age    sex weight.SD weight.MN
1 adult female  3.535534 97.500000
2 young female        NA 80.000000
3 adult   male        NA 90.000000
4 young   male        NA 75.000000

The principle to follow is to have your function return "one thing", which can be a vector or a list, but cannot be two successive function calls.
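A short sketch of that principle, using the question's data. The helper names `last_only`, `both`, `res`, and `flat` are made up for illustration. A braced block evaluates to its last expression, which is why `{mean(x) ; sd(x)}` yields only the SD. When the function returns a named vector instead, `aggregate` stores the results in a single matrix column; `do.call(data.frame, res)` is one common way to spread that matrix into ordinary columns.

```r
my.Data <- read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", header = TRUE)

# Two successive calls: only the value of the last one is returned.
last_only <- function(x) { mean(x); sd(x) }
last_only(c(100, 95))   # same value as sd(c(100, 95)); the mean is discarded

# One named vector: both statistics come back together.
both <- function(x) c(mean = mean(x), sd = sd(x))
res <- aggregate(weight ~ age + sex, data = my.Data, FUN = both)

# res has three columns; res$weight is a 4 x 2 matrix.
# Flatten it so each statistic becomes a regular column.
flat <- do.call(data.frame, res)
names(flat)
```

Flattening matters mostly when the result is passed on to code that expects one atomic vector per column; for printing, the matrix column already displays as `weight.mean` and `weight.sd`.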



Answer 2:

If you'd like to use data.table, it has `with` and `by` built right into it:

library(data.table)
myDT <- data.table(my.Data, key="animal")


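# Add the per-group mean and sd as new columns, by reference:
myDT[, c("mean", "sd") := list(mean(weight), sd(weight)), by=list(age, sex)]


# Or return one summary row per group:
myDT[, list(mean_Aggr=sum(mean(weight)), sd_Aggr=sum(sd(weight))), by=list(age, sex)]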
     age    sex mean_Aggr   sd_Aggr
1: adult female     96.0  3.6055513
2: young   male     76.5  2.1213203
3: adult   male     91.0  1.4142136
4: young female     84.5  0.7071068

I used a slightly different data set so as not to have NA values for the SD.



Answer 3:

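In the spirit of sharing, if you are familiar with SQL, you might also consider the "sqldf" package. (Emphasis added because you need to know, for example, that mean is avg in order to get the results you want.)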

library(sqldf)
sqldf("select age, sex, 
      avg(weight) `Wt.Mean`, 
      stdev(weight) `Wt.SD` 
      from `my.Data` 
      group by age, sex")
    age    sex Wt.Mean    Wt.SD
1 adult female    97.5 3.535534
2 adult   male    90.0 0.000000
3 young female    80.0 0.000000
4 young   male    75.0 0.000000


Answer 4:

reshape lets you pass two functions; reshape2 doesn't.

library(reshape)
my.Data = read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", sep = "", header = TRUE)
my.Data[,1]<- NULL
(a1<-  melt(my.Data, id.vars=c("age", "sex"), measure.vars=c("weight")))
(cast(a1, age + sex ~ variable, c(mean, sd), fill=NA))

#     age    sex weight_mean weight_sd
# 1 adult female        97.5  3.535534
# 2 adult   male        90.0        NA
# 3 young female        80.0        NA
# 4 young   male        75.0        NA

I owe this to @Ramnath, who pointed it out just yesterday.



Source: Multiple functions in a single tapply or aggregate statement