16

是否可以在单个 tapply 或聚合语句中包含两个函数?

下面我使用两个 tapply 语句和两个聚合语句:一个用于均值,一个用于 SD。
我更愿意合并这些陈述。

my.Data = read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", sep = "", header = TRUE)

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x)  }))

with(my.Data, aggregate(weight ~ age + sex, FUN = mean)
with(my.Data, aggregate(weight ~ age + sex, FUN =   sd)

# this does not work:

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))

# I would also prefer that the output be formatted something similar to that 
# show below.  `aggregate` formats the output perfectly.  I just cannot figure 
# out how to implement two functions in one statement.

  age    sex   mean        sd
adult female   97.5  3.535534
adult   male     90        NA
young female   80.0        NA
young   male     75        NA

我总是可以运行两个单独的语句并合并输出。我只是希望可能有一个更方便的解决方案。

我在这里找到了下面的答案:Apply multiple functions to column using tapply

f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )

但是,行或列都没有标记。

     [,1]     [,2]
[1,] 97.5 3.535534
[2,] 80.0       NA
[3,] 90.0       NA
[4,] 75.0       NA

我更喜欢 base R 中的解决方案。plyr包中的解决方案已发布在上面的链接中。如果我可以在上面的输出中添加正确的行和列标题,那将是完美的。

4

4 回答 4

18

但这些应该有:

with(my.Data, aggregate(weight, list(age, sex), function(x) { c(MEAN=mean(x), SD=sd(x) )}))

with(my.Data, tapply(weight, list(age, sex), function(x) { c(mean(x) , sd(x) )} ))
# Not a nice structure but the results are in there

with(my.Data, aggregate(weight ~ age + sex, FUN =  function(x) c( SD = sd(x), MN= mean(x) ) ) )
    age    sex weight.SD weight.MN
1 adult female  3.535534 97.500000
2 young female        NA 80.000000
3 adult   male        NA 90.000000
4 young   male        NA 75.

要遵守的原则是让你的函数返回“一个东西”,它可以是向量或列表,但不能是两个函数调用的连续调用。

于 2013-03-05T03:09:32.907 回答
10

如果您想使用 data.table,它具有withby内置于其中:

library(data.table)
myDT <- data.table(my.Data, key="animal")


myDT[, c("mean", "sd") := list(mean(weight), sd(weight)), by=list(age, sex)]


myDT[, list(mean_Aggr=sum(mean(weight)), sd_Aggr=sum(sd(weight))), by=list(age, sex)]
     age    sex mean_Aggr   sd_Aggr
1: adult female     96.0  3.6055513
2: young   male     76.5  2.1213203
3: adult   male     91.0  1.4142136
4: young female     84.5  0.7071068

我使用了稍微不同的数据集,以便没有NAsd 的值

于 2013-03-05T03:26:53.003 回答
7

本着分享的精神,如果您熟悉 SQL,您可能还会考虑“sqldf”包。mean(强调是因为你确实需要知道,例如,avg为了得到你想要的结果。)

sqldf("select age, sex, 
      avg(weight) `Wt.Mean`, 
      stdev(weight) `Wt.SD` 
      from `my.Data` 
      group by age, sex")
    age    sex Wt.Mean    Wt.SD
1 adult female    97.5 3.535534
2 adult   male    90.0 0.000000
3 young female    80.0 0.000000
4 young   male    75.0 0.000000
于 2013-03-05T18:26:29.947 回答
5

Reshape 让你传递 2 个函数;reshape2 没有。

library(reshape)
my.Data = read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", sep = "", header = TRUE)
my.Data[,1]<- NULL
(a1<-  melt(my.Data, id=c("age", "sex"), measured=c("weight")))
(cast(a1, age + sex ~ variable, c(mean, sd), fill=NA))

#     age    sex weight_mean weight_sd
# 1 adult female        97.5  3.535534
# 2 adult   male        90.0        NA
# 3 young female        80.0        NA
# 4 young   male        75.0        NA

这要归功于@Ramnath,他昨天才注意到这一点。

于 2013-03-05T04:10:44.890 回答