
With R's tapply function, is there an easy way to get the results of several functions (e.g. mean and sd) combined, in list form?

That is, so that the output of:

tapply(x, factor, mean)
tapply(x, factor, sd)

appears together in a single data frame.

4 Answers

5

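Here are two approaches, with some variations of each:

  1. In the first approach, we use a function that returns both mean and sd.

  2. In the second approach, we call tapply repeatedly, once for mean and once for sd.

We use the iris data set that ships with R so that the code can be run as is:

1) First solution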

# input data
x <- iris$Sepal.Length
factor <- iris$Species

### Solution 1
mean.sd <- function(x) c(mean = mean(x), sd = sd(x))
simplify2array(tapply(x, factor, mean.sd))

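The following are two variations of the above solution. They use the same tapply construct but simplify its result with do.call. The first, using rbind, gives one row per group and one column per statistic; the second, using cbind, is its transpose, with one column per group: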

# Solution 1a - use same mean.sd
do.call("rbind", tapply(x, factor, mean.sd))

# Solution 1b - use same mean.sd - result is the transpose of 1a
do.call("cbind", tapply(x, factor, mean.sd))
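
To see how the three variants are oriented, a quick check of their dimensions can help. This is only a minimal sketch that reuses x, factor and mean.sd from above; the names res1, res1a and res1b are mine, for illustration:

```r
# input data and helper, as defined above
x <- iris$Sepal.Length
factor <- iris$Species
mean.sd <- function(x) c(mean = mean(x), sd = sd(x))

# illustrative names, not part of the original answer
res1  <- simplify2array(tapply(x, factor, mean.sd))
res1a <- do.call("rbind", tapply(x, factor, mean.sd))
res1b <- do.call("cbind", tapply(x, factor, mean.sd))

dim(res1)   # statistics in rows, groups in columns
dim(res1a)  # groups in rows, statistics in columns
dim(res1b)  # statistics in rows, groups in columns
```

If groups are wanted as rows, t(res1) or res1a can be used.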

2) Second solution. Here the result has the same orientation as 1a, with one row per group:

### Solution 2 - orientation is the same as 1a
mapply(tapply, c(mean = mean, sd = sd), MoreArgs = list(X = x, INDEX = factor))

This is the same as 2, except that we transpose the result at the end so that it corresponds to 1b:

# Solution 2a - same as 2 except orientation is transposed so that it corresponds to 1b
t(mapply(tapply, c(mean = mean, sd = sd), MoreArgs = list(X = x, INDEX = factor)))
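
Since the question asks for a data frame, the matrix from Solution 2 can be converted in one more step. A minimal sketch, assuming the same iris data; the names res and df and the Species column are my additions, not part of the original answer:

```r
x <- iris$Sepal.Length
factor <- iris$Species

# one row per group, one column per statistic (Solution 2)
res <- mapply(tapply, c(mean = mean, sd = sd),
              MoreArgs = list(X = x, INDEX = factor))

# turn the matrix into a data frame, keeping the group labels as a column
df <- data.frame(Species = rownames(res), res, row.names = NULL)
df
```
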
Answered 2013-05-14T15:56:17.720
5
data.frame(rbind(tapply(y, x, mean), tapply(y, x, sd)))

OR

data.frame(cbind(tapply(y, x, mean), tapply(y, x, sd)))

depending on how you'd like them to line up.
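
As a runnable sketch of the cbind form: here y is taken to be the values and x the grouping variable, with iris as stand-in data (my assumption, since the answer does not define them). Naming the arguments to cbind is also my addition, to get readable column names:

```r
y <- iris$Sepal.Length   # values (assumed example data)
x <- iris$Species        # grouping variable (assumed example data)

# one row per group; the argument names become the column names
out <- data.frame(cbind(mean = tapply(y, x, mean), sd = tapply(y, x, sd)))
out
```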

Have a safe trip to Stack Overflow!

Answered 2013-05-14T14:41:23.800
4

Here is an example using the plyr package:

library(plyr)
ddply(iris, "Species", summarise, mean=mean(Sepal.Length), sd=sd(Sepal.Length))

     Species  mean        sd
1     setosa 5.006 0.3524897
2 versicolor 5.936 0.5161711
3  virginica 6.588 0.6358796

or,

ddply(iris, "Species", with, each(mean, sd)(Sepal.Length))

     Species  mean        sd
1     setosa 5.006 0.3524897
2 versicolor 5.936 0.5161711
3  virginica 6.588 0.6358796
Answered 2013-05-14T16:25:15.883
2

aggregate offers another way.

x <- 1:3
fac <- c('a', 'a', 'b')
do.call(data.frame, 
        aggregate(x, list(fac), function(y) c(mean=mean(y), sd=sd(y))))

#   Group.1 x.mean   x.sd
# 1       a    1.5 0.7071
# 2       b    3.0     NA

This generalizes easily to any number of functions:

fs <- c(mean=mean, sd=sd, median=median)
do.call(data.frame, 
        aggregate(x, list(fac), function(y) sapply(fs, function(f) f(y))))

#   Group.1 x.mean   x.sd x.median
# 1       a    1.5 0.7071      1.5
# 2       b    3.0     NA      3.0
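
One way to package the generalized version is a small helper. This is only a sketch: summarise_by is a hypothetical name, not a base R function, and iris is used as example data:

```r
# hypothetical helper: apply every function in fs to x within each group
summarise_by <- function(x, group, fs) {
  res <- aggregate(x, list(group), function(y) sapply(fs, function(f) f(y)))
  # expand the matrix column produced by aggregate into separate columns
  do.call(data.frame, res)
}

out <- summarise_by(iris$Sepal.Length, iris$Species,
                    c(mean = mean, sd = sd, median = median))
out
```
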
Answered 2013-05-14T15:39:03.017