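ddply returns a data frame as output and, assuming I am reading your question correctly, that is not what you are after. I believe you want to run a series of t-tests on a series of subsets of your data, so the only real task is compiling a list of those subsets. Once you have them, you can use a function like lapply() to run a t-test on each subset in the list. I am sure this is not the most elegant solution, but one approach is to build a list of unique pairs of colors with a function like this: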
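get.pairs <- function(v) {
  # Return every unordered pair of elements of v as a list of length-2 vectors.
  l <- length(v)
  n <- choose(l, 2)  # number of pairs; same value as sum(1:l - 1)
  a <- vector("list", n)
  j <- 1
  k <- 2
  for (i in seq_len(n)) {  # seq_len() skips the loop when n is 0
    a[[i]] <- c(v[j], v[k])
    if (k < l) {
      k <- k + 1
    } else {
      j <- j + 1
      k <- j + 1
    }
  }
  return(a)
}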
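As an aside, base R's combn() can build the same list without a hand-written loop. The following is only a minimal sketch, assuming the input is a character vector of at least two factor levels; the name get.pairs.combn is hypothetical and not part of the original answer.

```r
# Hypothetical alternative to get.pairs() built on base R's combn().
# simplify = FALSE makes combn() return a list with one element per pair.
get.pairs.combn <- function(v) {
  combn(v, 2, simplify = FALSE)
}

pairs <- get.pairs.combn(c("D", "E", "F", "G", "H", "I", "J"))
length(pairs)  # number of unordered pairs, choose(7, 2)
pairs[[1]]     # first pair in the list
```

The pairs come out in the same order as the loop above, so either version can feed the later steps.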
Now you can use get.pairs() to get your list of unique color pairs:
> (color.pairs <- get.pairs(levels(diam$color)))
[[1]]
[1] "D" "E"
[[2]]
[1] "D" "F"
...
[[21]]
[1] "I" "J"
Now you can use each element of this list to run t.test (or whatever you like) on a subset of your data frame, like so:
> t.test(price~cut,data=diam[diam$color %in% color.pairs[[1]],])
Welch Two Sample t-test
data: price by cut
t = 8.1594, df = 427.272, p-value = 3.801e-15
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
1008.014 1647.768
sample estimates:
mean in group Fair mean in group Ideal
3938.711 2610.820
And now use lapply() to run the test for each subset in your list of color pairs:
> lapply(color.pairs,function(x) t.test(price~cut,data=diam[diam$color %in% x,]))
[[1]]
Welch Two Sample t-test
data: price by cut
t = 8.1594, df = 427.272, p-value = 3.801e-15
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
1008.014 1647.768
sample estimates:
mean in group Fair mean in group Ideal
3938.711 2610.820
...
[[21]]
Welch Two Sample t-test
data: price by cut
t = 0.8813, df = 375.996, p-value = 0.3787
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-260.0170 682.3882
sample estimates:
mean in group Fair mean in group Ideal
4802.912 4591.726
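If the full printed output is more than you need, the list returned by lapply() can be reduced to just the p-values. This is a rough sketch rather than part of the original answer: the data frame below is a synthetic stand-in for diam (the real data are not shown here), and the "D-E" style labels are an illustrative choice.

```r
# Synthetic stand-in for `diam` (an assumption; values are random, not real prices).
set.seed(1)
diam <- data.frame(
  price = round(runif(600, 300, 18000)),
  cut   = factor(sample(c("Fair", "Ideal"), 600, replace = TRUE)),
  color = factor(sample(c("D", "E", "F", "G", "H", "I", "J"), 600, replace = TRUE))
)

# Build the color pairs and run one Welch t-test per pair, as above.
color.pairs <- combn(levels(diam$color), 2, simplify = FALSE)
tests <- lapply(color.pairs, function(x) {
  t.test(price ~ cut, data = diam[diam$color %in% x, ])
})

# Label each result with its pair, then pull the p-value out of each htest object.
names(tests) <- sapply(color.pairs, paste, collapse = "-")
p.values <- sapply(tests, function(res) res$p.value)
head(p.values)
```

Naming the list elements first means the p-values come back as a named vector, which is easier to read than positions 1 to 21.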