r - Getting p-values for group mean differences without refitting the linear model with a new reference level

Question

Suppose we have a linear model with a factor variable X (with levels A, B and C):

y ~ factor(X) + Var2 + Var3

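The output shows the estimates XB and XC, which are the differences B - A and C - A (assuming the reference level is A).

If we want the p-value for the difference between B and C, that is C - B, we have to set B or C as the reference group and rerun the model.

Can we get the p-values for the effects B - A, C - A and C - B all at once?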

score 3 · Accepted Answer

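You are looking for a linear hypothesis test, that is, the p-value of some linear combination of regression coefficients. Based on my answer to "How to conduct linear hypothesis test on regression coefficients with a clustered covariance matrix?", where we only considered a sum of coefficients, I will extend the function LinearCombTest to handle the more general case, where alpha gives the combination coefficients for the variables in vars: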

LinearCombTest <- function (lmObject, vars, alpha, .vcov = NULL) {
  ## if `.vcov` missing, use the one returned by `lm`
  if (is.null(.vcov)) .vcov <- vcov(lmObject)
  ## estimated coefficients
  beta <- coef(lmObject)
  ## linear combination of `vars` with combination coefficients `alpha`
  LinearComb <- sum(beta[vars] * alpha)
  ## get the standard error of `LinearComb`
  LinearComb_se <- sum(alpha * crossprod(.vcov[vars, vars], alpha)) ^ 0.5
  ## perform t-test on `LinearComb`
  tscore <- LinearComb / LinearComb_se
  pvalue <- 2 * pt(abs(tscore), lmObject$df.residual, lower.tail = FALSE)
  ## return a matrix
  form <- paste0("(", paste(alpha, vars, sep = " * "), ")")
  form <- paste0(paste0(form, collapse = " + "), " = 0")
  matrix(c(LinearComb, LinearComb_se, tscore, pvalue), nrow = 1L,
         dimnames = list(form, c("Estimate", "Std. Error", "t value", "Pr(>|t|)")))
  }
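
As a reading aid, the quantities computed inside LinearCombTest can be written out as below. The notation is mine and does not appear in the code: \hat\beta_S is the vector of estimated coefficients named in vars, V_S is the matching block of the covariance matrix .vcov, and \nu is the residual degrees of freedom.

```latex
\hat\theta = \alpha^\top \hat\beta_S, \qquad
\operatorname{se}(\hat\theta) = \sqrt{\alpha^\top V_S \, \alpha}, \qquad
t = \frac{\hat\theta}{\operatorname{se}(\hat\theta)}, \qquad
p = 2 \, P\left(T_\nu > |t|\right)
```

For a difference of two coefficients with alpha = (1, -1), the standard error reduces to the square root of Var(b1) + Var(b2) - 2 Cov(b1, b2).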

Consider a simple example, where we have a balanced design for three groups A, B and C, with group means 0, 1 and 2, respectively.

x <- gl(3,100,labels = LETTERS[1:3])
set.seed(0)
y <- c(rnorm(100, 0), rnorm(100, 1), rnorm(100, 2)) + 0.1

fit <- lm(y ~ x)
coef(summary(fit))

#             Estimate Std. Error   t value     Pr(>|t|)
#(Intercept) 0.1226684 0.09692277  1.265631 2.066372e-01
#xB          0.9317800 0.13706949  6.797866 5.823987e-11
#xC          2.0445528 0.13706949 14.916177 6.141008e-38

Since A is the reference level, xB gives B - A and xC gives C - A. Suppose we are now interested in the difference between groups B and C, i.e. C - B. We can use

LinearCombTest(fit, c("xC", "xB"), c(1, -1))

#                         Estimate Std. Error  t value     Pr(>|t|)
#(1 * xC) + (-1 * xB) = 0 1.112773  0.1370695 8.118312 1.270686e-14
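
As a sanity check, the same difference can be obtained the long way, by refitting with B as the reference level, which is exactly what the question wants to avoid. This is only a sketch; the names x2 and fit_B are mine and are not part of the original answer.

```r
# Same simulated data as in the example above
x <- gl(3, 100, labels = LETTERS[1:3])
set.seed(0)
y <- c(rnorm(100, 0), rnorm(100, 1), rnorm(100, 2)) + 0.1
fit <- lm(y ~ x)

# Refit with B as the reference level (x2 and fit_B are illustrative names);
# the coefficient "x2C" is then the difference C - B
x2 <- relevel(x, ref = "B")
fit_B <- lm(y ~ x2)
coef(summary(fit_B))["x2C", ]

# The refitted estimate should agree with the linear combination xC - xB
all.equal(unname(coef(fit_B)["x2C"]),
          unname(coef(fit)["xC"] - coef(fit)["xB"]))
```

The row printed for x2C should match the row returned by LinearCombTest above, since both describe the same contrast in the same fitted mean structure.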

Note that this function is also handy for working out the group means of B and C, that is, (Intercept) + xB and (Intercept) + xC:

LinearCombTest(fit, c("(Intercept)", "xB"), c(1, 1))

#                                 Estimate Std. Error  t value     Pr(>|t|)
#(1 * (Intercept)) + (1 * xB) = 0 1.054448 0.09692277 10.87926 2.007956e-23

LinearCombTest(fit, c("(Intercept)", "xC"), c(1, 1))

#                                 Estimate Std. Error  t value     Pr(>|t|)
#(1 * (Intercept)) + (1 * xC) = 0 2.167221 0.09692277 22.36029 1.272811e-65

Alternative solution with lsmeans

Consider the toy example above again:

library(lsmeans)
lsmeans(fit, spec = "x", contr = "revpairwise")

#$lsmeans
# x    lsmean         SE  df    lower.CL  upper.CL
# A 0.1226684 0.09692277 297 -0.06807396 0.3134109
# B 1.0544484 0.09692277 297  0.86370603 1.2451909
# C 2.1672213 0.09692277 297  1.97647888 2.3579637
#
#Confidence level used: 0.95 
#
#$contrasts
# contrast estimate        SE  df t.ratio p.value
# B - A    0.931780 0.1370695 297   6.798  <.0001
# C - A    2.044553 0.1370695 297  14.916  <.0001
# C - B    1.112773 0.1370695 297   8.118  <.0001
#
#P value adjustment: tukey method for comparing a family of 3 estimates

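The $lsmeans component returns the marginal group means, while $contrasts returns the pairwise group mean differences, since we used the "revpairwise" contrast. Read page 32 of the lsmeans documentation for the difference between "pairwise" and "revpairwise".

Well, this is certainly interesting, as we can compare the result with LinearCombTest. We see that LinearCombTest is correct: for C - B it gives the same estimate, standard error and t value.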
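
The p-values in the $contrasts table are Tukey-adjusted, as the last line of that output says, whereas LinearCombTest reports an unadjusted p-value. If the unadjusted values are wanted for a like-for-like comparison, something along the following lines should work. This is a sketch that assumes the summary method for the contrasts accepts adjust = "none"; the names lsm and unadj are mine.

```r
library(lsmeans)

# Same simulated data as in the example above
x <- gl(3, 100, labels = LETTERS[1:3])
set.seed(0)
y <- c(rnorm(100, 0), rnorm(100, 1), rnorm(100, 2)) + 0.1
fit <- lm(y ~ x)

# Marginal means plus reversed pairwise contrasts (B - A, C - A, C - B)
lsm <- lsmeans(fit, specs = "x", contr = "revpairwise")

# Request the contrasts without a multiplicity adjustment
unadj <- summary(lsm$contrasts, adjust = "none")
unadj
```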

score 1 · Accepted Answer

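glht (general linear hypothesis testing) from the multcomp package makes this kind of multiple hypothesis testing easy, without rerunning a bunch of separate models. It essentially builds a custom contrast matrix from the comparisons of interest that you define.

Using your example comparisons and building on the data provided by @ZheyuanLi: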

x <- gl(3,100,labels = LETTERS[1:3])
set.seed(0)
y <- c(rnorm(100, 0), rnorm(100, 1), rnorm(100, 2)) + 0.1

fit <- lm(y ~ x)

library(multcomp)
my_ht <- glht(fit, linfct = mcp(x = c("B-A = 0",
                             "C-A = 0",
                             "C-B = 0")))

summary(my_ht) will give you adjusted p-values for the comparisons of interest.

#Linear Hypotheses:
#           Estimate Std. Error t value Pr(>|t|)    
#B - A == 0   0.9318     0.1371   6.798 1.11e-10 ***
#C - A == 0   2.0446     0.1371  14.916  < 1e-10 ***
#C - B == 0   1.1128     0.1371   8.118  < 1e-10 ***
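
By default summary() on a glht object reports adjusted p-values. If unadjusted p-values or simultaneous confidence intervals are of interest, a sketch along these lines may help; it relies on adjusted() and confint() from multcomp, and the name unadj is mine.

```r
library(multcomp)

# Same simulated data as in the example above
x <- gl(3, 100, labels = LETTERS[1:3])
set.seed(0)
y <- c(rnorm(100, 0), rnorm(100, 1), rnorm(100, 2)) + 0.1
fit <- lm(y ~ x)

my_ht <- glht(fit, linfct = mcp(x = c("B - A = 0",
                                      "C - A = 0",
                                      "C - B = 0")))

# Unadjusted p-values for the three comparisons
unadj <- summary(my_ht, test = adjusted("none"))
unadj

# Simultaneous confidence intervals for the same comparisons
confint(my_ht)
```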

score 0 · Accepted Answer

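You can use the car library, and use the function linearHypothesis with the vcov argument.

Set this to the variance-covariance matrix of your model.

The function takes formulas or a matrix to describe the system of equations you want to test.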
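
A minimal sketch of what this could look like for the C - B comparison, reusing the simulated data from the first answer. In car the covariance argument is spelled vcov. (with a trailing dot); the name lh is mine. The function reports an F test, which for a single hypothesis is the square of the corresponding t statistic.

```r
library(car)

# Same simulated data as in the first answer
x <- gl(3, 100, labels = LETTERS[1:3])
set.seed(0)
y <- c(rnorm(100, 0), rnorm(100, 1), rnorm(100, 2)) + 0.1
fit <- lm(y ~ x)

# Hypothesis given as a string: C - B = 0 in terms of the coefficient names
lh <- linearHypothesis(fit, "xC - xB = 0", vcov. = vcov(fit))
lh

# The same hypothesis in matrix form: one row over (Intercept), xB, xC
linearHypothesis(fit, hypothesis.matrix = c(0, -1, 1), rhs = 0)
```

A different covariance matrix, for example a clustered or heteroskedasticity-robust one, can be passed through vcov. in place of vcov(fit).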
