r - How to do: correlation with "blocks" (or "repeated measures"?!)?

Question

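I have the following setup to analyse: we have about 150 subjects, and for each subject we performed 18 tests (under different conditions). The 18 conditions are complementary, so that if we averaged over the tests (for each subject) we would get no correlation between the tests (between subjects). What we wish to know is the correlation (and p-value) between the tests, within subjects, but over all the subjects.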

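The way I do this now is to compute the correlation for each subject, then look at the distribution of the resulting correlations to see whether its mean differs from 0. But I suspect there might be a better way of answering the same question (someone mentioned something to me about "geographical correlation", but a shallow search didn't help).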

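PS: I understand there might be a place here for some sort of mixed model, but I would prefer to present a "correlation", and I am not sure how to extract such an output from a mixed model.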

Also, here is a short dummy code to give an idea of what I am talking about:

attach(longley)
N <- length(Unemployed)
block <- c(
        rep( "a", N),
        rep( "b", N),
        rep( "c", N)
        )

Unemployed.3 <- c(Unemployed + rnorm(1),
                    Unemployed + rnorm(1),
                    Unemployed + rnorm(1))

GNP.deflator.3 <- c(GNP.deflator + rnorm(1),
                    GNP.deflator + rnorm(1),
                    GNP.deflator + rnorm(1))

cor(Unemployed, GNP.deflator)
cor(Unemployed.3, GNP.deflator.3)
cor(Unemployed.3[block == "a"], GNP.deflator.3[block == "a"])
cor(Unemployed.3[block == "b"], GNP.deflator.3[block == "b"])
cor(Unemployed.3[block == "c"], GNP.deflator.3[block == "c"])
(I would like to somehow combine the last three correlations...)

Any ideas would be welcome.

Best, Tal

score 4 · Accepted Answer

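I agree with Tristan: you are looking for the ICC. The only difference from standard implementations is that the two raters (tests) evaluate each subject repeatedly. There might be an implementation that allows that. In the meantime, here is another approach to get the correlation.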

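You can use a "general linear model", a generalisation of the linear model that explicitly allows correlation between the residuals. The code below implements this using the gls function (generalized least squares) of the nlme package. I am sure there are other ways as well. To use this function we first have to reshape the data into "long" format. I also changed the variable names to x and y for simplicity, and I used +rnorm(N) instead of +rnorm(1) in your code, because that is what I think you meant.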

library(reshape)
library(nlme)
dd <- data.frame(x=Unemployed.3, y=GNP.deflator.3, block=factor(block))
dd$occasion <- factor(rep(1:N, 3))  # variable denoting measurement occasions
dd2 <- melt(dd, id=c("block","occasion"))  # reshape

# fit model with the values within a measurement occasion correlated
#   and different variances allowed for the two variables
mod <- gls(value ~ variable + block, data=dd2,
           correlation=corSymm(form=~1|block/occasion),  # full argument name instead of partial match
           weights=varIdent(form=~1|variable))
# extract correlation
mod$modelStruct$corStruct

Within the modelling framework you can use a likelihood-ratio test to get a p-value. nlme can also give you a confidence interval:

mod2 <- gls(value ~ variable + block, data=dd2, 
           weights=varIdent(form=~1|variable))  
anova(mod, mod2)   # likelihood-ratio test for corr=0

intervals(mod)$corStruct  # confidence interval for the correlation
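
As a rough cross-check on the gls estimate, one can remove the block means from each variable and correlate what is left. This is only a sketch under assumptions that are not in the original post: the toy data are regenerated here with +rnorm(3 * N) noise and a fixed seed, observations are treated as independent given the block, and the names rx, ry, r_within and dof are made up for illustration. It is a slightly different model from the gls fit above, so the two numbers need not agree exactly.

```r
# Toy data rebuilt from longley (assumption: noise added per observation)
set.seed(1)
N <- nrow(longley)
block <- factor(rep(c("a", "b", "c"), each = N))
x <- rep(longley$Unemployed, 3) + rnorm(3 * N)
y <- rep(longley$GNP.deflator, 3) + rnorm(3 * N)

# Remove the block means from each variable, then correlate the residuals
rx <- resid(lm(x ~ block))
ry <- resid(lm(y ~ block))
r_within <- cor(rx, ry)

# t-test for r_within = 0; degrees of freedom are reduced by the
# k - 1 block contrasts that were estimated: n - 2 - (k - 1)
k <- nlevels(block)
dof <- length(x) - k - 1
t_stat <- r_within * sqrt(dof / (1 - r_within^2))
p_value <- 2 * pt(-abs(t_stat), dof)

r_within
p_value
```

With real repeated-measures data the independence assumption is doubtful, which is one reason to prefer the model-based interval from intervals(mod).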

score 1 · Accepted Answer

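If I understand your question correctly, you are interested in computing the intraclass correlation between multiple tests. There is an implementation in the psy package, although I have not used it.

If you want to perform inference on the correlation estimate, you could bootstrap the subjects. Just make sure to keep the tests together for each sample.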
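
A minimal sketch of what bootstrapping the subjects could look like, using base R only. Everything here is hypothetical: the simulated data (30 subjects with 18 paired measurements x and y each), the choice of the mean within-subject correlation as the statistic, and the names by_subj and boot_stat. The point it illustrates is that whole subjects are resampled, so each subject's tests stay together.

```r
# Simulated stand-in data (assumption: 30 subjects, 18 paired tests each)
set.seed(42)
n_subj <- 30
n_test <- 18
dat <- data.frame(subject = rep(seq_len(n_subj), each = n_test),
                  x = rnorm(n_subj * n_test))
dat$y <- 0.5 * dat$x + rnorm(n_subj * n_test)

# One data frame per subject, so a subject's tests are never split up
by_subj <- split(dat[c("x", "y")], dat$subject)

# Statistic: mean of the within-subject correlations
stat <- function(subjects) {
  mean(sapply(subjects, function(s) cor(s$x, s$y)))
}

# Resample subject IDs with replacement and recompute the statistic
boot_stat <- function() {
  ids <- sample(names(by_subj), replace = TRUE)
  stat(by_subj[ids])
}

est  <- stat(by_subj)
boot <- replicate(1000, boot_stat())
ci   <- quantile(boot, c(0.025, 0.975))  # percentile interval

est
ci
```

The percentile interval is the simplest choice; other bootstrap intervals would work with the same resampling scheme.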

score 0 · Accepted Answer

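I'm no expert, but this looks to me like what you want. It is automated, short to code, gives the same correlations as your example above, and produces p-values.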

> df = data.frame(block=block, Unemployed=Unemployed.3,
+ GNP.deflator=GNP.deflator.3)
> require(plyr)
Loading required package: plyr
> ddply(df, "block", function(x){
+   as.data.frame(
+     with(x,cor.test(Unemployed, GNP.deflator))[c("p.value","estimate")]
+ )})
  block    p.value  estimate
1     a 0.01030636 0.6206334
2     b 0.01030636 0.6206334
3     c 0.01030636 0.6206334

To see all the details, do the following:

> dlply(df, "block", function(x){with(x,cor.test(Unemployed, GNP.deflator))})
$a

    Pearson's product-moment correlation

data:  Unemployed and GNP.deflator 
t = 2.9616, df = 14, p-value = 0.01031
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1804410 0.8536976 
sample estimates:
      cor 
0.6206334 


$b

    Pearson's product-moment correlation

data:  Unemployed and GNP.deflator 
t = 2.9616, df = 14, p-value = 0.01031
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1804410 0.8536976 
sample estimates:
      cor 
0.6206334 


$c

    Pearson's product-moment correlation

data:  Unemployed and GNP.deflator 
t = 2.9616, df = 14, p-value = 0.01031
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1804410 0.8536976 
sample estimates:
      cor 
0.6206334 


attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
  block
1     a
2     b
3     c
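
The per-block results can also be pooled into a single figure, along the lines the question describes (test whether the per-block correlations differ from 0 on average). The sketch below is an assumption-laden illustration, not part of the original answer: the data are regenerated with +rnorm(3 * N) so the blocks actually differ, the correlations are averaged on Fisher's z scale, and the names r_block, z and pooled_r are hypothetical. With only three blocks the test has 2 degrees of freedom, so it is very weak here; with about 150 subjects it would be more informative.

```r
# Toy data rebuilt from longley (assumption: noise added per observation)
set.seed(1)
N <- nrow(longley)
df <- data.frame(
  block        = rep(c("a", "b", "c"), each = N),
  Unemployed   = rep(longley$Unemployed, 3) + rnorm(3 * N),
  GNP.deflator = rep(longley$GNP.deflator, 3) + rnorm(3 * N)
)

# One correlation per block
r_block <- sapply(split(df, df$block),
                  function(d) cor(d$Unemployed, d$GNP.deflator))

# Fisher z-transform, one-sample t-test against 0, back-transform the mean
z <- atanh(r_block)
tt <- t.test(z, mu = 0)
pooled_r <- tanh(mean(z))

r_block
pooled_r
tt$p.value
```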
