
I want to test a data set for independence. A reproducible example follows:

# Values differ between runs unless a seed is set with set.seed()
income <- rep(c("q1","q2","q3","q4","q5"), times = 4)
v1 <- round(runif(20,40,60),2)
v2 <- round(runif(20,10,20),2)
v3 <- round(runif(20,100,200),2)
v4 <- round(runif(20,0,20),2)

# data.frame() keeps v1-v4 numeric; as.data.frame(cbind(...)) would coerce every column to character
df <- data.frame(income,v1,v2,v3,v4)

   income    v1    v2     v3    v4
1      q1 47.78 18.70 148.75 14.15
2      q2 59.22 19.95 141.65  2.63
3      q3 58.34 14.96 169.94 20.00
4      q4 40.35 12.28 143.82 12.14
5      q5 59.72 17.14 191.72 10.66
6      q1 59.44 10.32 128.23  1.00
7      q2 47.65 13.87 187.51  5.74
...

I want to test whether v1, v2, v3, and v4 are independent of income group (q1-q5).

The result should look like this:

income     v1           v2          v3          v4        p-value
  q1    mean.v1.q1  mean.v2.q1  mean.v3.q1  mean.v4.q1
  q2    mean.v1.q2  mean.v2.q2  mean.v3.q2  mean.v4.q2
  q3    mean.v1.q3  mean.v2.q3  mean.v3.q3  mean.v4.q3
  q4    mean.v1.q4  mean.v2.q4  mean.v3.q4  mean.v4.q4
  q5    mean.v1.q5  mean.v2.q5  mean.v3.q5  mean.v4.q5

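I think I should apply ANOVA to get the test results, but I'm not sure how. Can anyone help?

I came up with the script below. Is this the right approach? Is there anything that could be improved? Thanks!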

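# Mean of each variable within each income group
v1mean <- as.data.frame(tapply(v1,income,mean))
colnames(v1mean) <- "v1"
v2mean <- as.data.frame(tapply(v2,income,mean))
colnames(v2mean) <- "v2"
v3mean <- as.data.frame(tapply(v3,income,mean))
colnames(v3mean) <- "v3"
v4mean <- as.data.frame(tapply(v4,income,mean))
colnames(v4mean) <- "v4"

# Combine the group means and reshape to long format
# (named `means` and `fit` so that base mean() and aov() are not masked)
means <- cbind(income=rownames(v1mean),v1mean,v2mean,v3mean,v4mean)
library(reshape)
means <- melt(means,id="income")

# ANOVA on the group means (not on the raw observations)
fit <- aov(value~variable + income,data=means)
summary(fit)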
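
For comparison, here is a minimal sketch of one possible alternative: fit a separate one-way ANOVA for each variable on the raw observations rather than on the group means. The names `vars`, `group_means`, and `p_values`, and the seed, are assumptions made for illustration. A one-way ANOVA tests whether the group means differ, which may or may not be what "independence" means here.

```r
# Assumption: same layout as the example data (income plus numeric v1..v4).
# The seed is arbitrary and only makes this sketch repeatable.
set.seed(42)
df <- data.frame(
  income = rep(c("q1", "q2", "q3", "q4", "q5"), times = 4),
  v1 = round(runif(20, 40, 60), 2),
  v2 = round(runif(20, 10, 20), 2),
  v3 = round(runif(20, 100, 200), 2),
  v4 = round(runif(20, 0, 20), 2)
)
vars <- c("v1", "v2", "v3", "v4")

# Group means: one row per income group, one column per variable.
group_means <- aggregate(df[vars], by = list(income = df$income), FUN = mean)

# One-way ANOVA per variable on the raw observations;
# keep the p-value of the F-test for the income term.
p_values <- sapply(vars, function(v) {
  fit <- aov(reformulate("income", response = v), data = df)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

print(group_means)
print(round(p_values, 4))
```

This gives one p-value per variable (v1 to v4), so the p-values would fit more naturally as an extra row under the table of means than as a column beside it.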