我想比较三种情况下动物的皮质醇水平:
- 皮质醇:皮质醇植入物(实验性)
- 基线:非压力条件(非实验性)
- 驱逐:被驱逐出群体后,据称是压力条件(非实验性)
我的目标有两个:
i) 将皮质醇治疗诱导的皮质醇水平与非实验皮质醇水平(基线和驱逐)进行比较。
ii) 确定驱逐条件是否有压力,即诱发比基线更高的皮质醇水平
为了回答这个问题,我在 glmmTMB 中指定了一个 glmm,带有 gamma 错误结构和日志链接。我将采样时间(居中)添加为独立模型协变量,以说明采样过程可能会带来压力并影响皮质醇水平的可能性。皮质醇被设置为参考水平。
df <- structure(list(GroupID = c(2L, 2L, 2L, 2L, 2L, 6L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
12L, 14L, 14L, 14L, 14L, 14L, 14L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 9L, 9L, 9L,
9L, 14L, 14L, 14L, 1L, 1L, 1L, 1L, 1L, 1L, 8L, 7L, 10L, 13L,
14L, 11L), AnimalID = c(2L, 2L, 2L, 2L, 2L, 14L, 16L, 17L, 17L,
17L, 18L, 18L, 18L, 21L, 21L, 21L, 22L, 22L, 22L, 23L, 24L, 24L,
24L, 27L, 27L, 27L, 28L, 28L, 29L, 1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L, 6L, 6L, 12L, 12L, 13L, 13L, 14L, 14L, 15L, 15L, 19L,
19L, 20L, 20L, 26L, 26L, 28L, 7L, 8L, 9L, 9L, 10L, 11L, 17L,
18L, 22L, 25L, 27L, 28L), Condition = structure(c(2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L), .Label = c("Cortisol", "Baseline", "Eviction"), class = "factor"),
SamplingTime = c(66.2205882352941, 15.2205882352941, -32.7794117647059,
-35.7794117647059, 39.2205882352941, -17.7794117647059, 19.2205882352941,
70.2205882352941, 13.2205882352941, 20.2205882352941, 81.2205882352941,
29.2205882352941, 20.2205882352941, -40.7794117647059, -35.7794117647059,
29.2205882352941, 74.2205882352941, 11.2205882352941, 5.22058823529412,
48.2205882352941, 71.2205882352941, -31.7794117647059, 9.22058823529412,
15.2205882352941, 0.220588235294116, -21.7794117647059, -28.7794117647059,
-28.7794117647059, 18.2205882352941, 9.22058823529412, -43.7794117647059,
-17.7794117647059, -13.7794117647059, -10.7794117647059,
-34.7794117647059, 2.22058823529412, 45.2205882352941, -47.7794117647059,
-41.7794117647059, -25.7794117647059, -34.7794117647059,
-45.7794117647059, 74.2205882352941, -24.7794117647059, -16.7794117647059,
-25.7794117647059, -35.7794117647059, -8.77941176470588,
94.2205882352941, 7.22058823529412, -18.7794117647059, 33.2205882352941,
-19.7794117647059, -39.7794117647059, -17.7794117647059,
10.2205882352941, 62.2205882352941, -31.7794117647059, -25.7794117647059,
-35.7794117647059, -18.7794117647059, -33.7794117647059,
-29.7794117647059, -12.7794117647059, 41.2205882352941, 18.2205882352941,
-21.7794117647059, -44.7794117647059), Cort = c(24.34332718,
122.3662678, 14.81649737, 29.25135246, 61.35316584, 51.13096882,
16.45238313, 72.75506886, 65.24373444, 19.83977651, 93.95373749,
88.76846447, 25.9247257, 11.10998092, 11.02192194, 55.65196428,
54.5021593, 70.57177391, 15.52706931, 42.8042128, 34.32402168,
11.04777398, 32.55449895, 16.06094915, 4.51891805, 44.2694076,
43.58813571, 25.42779488, 84.73124038, 41.02017209, 66.83361158,
80.80109134, 223.2265155, 106.2662754, 106.3685473, 71.98187244,
46.16678106, 115.7921086, 43.22291137, 97.94202063, 85.25515085,
141.7936754, 35.25707239, 62.46268318, 34.80581329, 131.9996656,
70.74119652, 201.1820693, 73.89806583, 171.1220827, 104.5034774,
97.57763406, 38.68277741, 105.7560627, 80.10706126, 38.21521771,
173.2503328, 31.86197343, 9.51105186, 49.09217616, 80.62046751,
94.19424573, 69.65166083, 44.07333383, 118.3258082, 33.46058913,
44.2694076, 43.56762495)), row.names = c(NA, -68L), class = "data.frame")
library(glmmTMB)
m1 <- glmmTMB(Cort~Condition+SamplingTime + (1|GroupID/AnimalID),family=Gamma ("log"),data = df)
summary(m1)
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.59766 0.12282 37.44 < 2e-16 ***
ConditionBaseline -0.94822 0.18061 -5.25 1.52e-07 ***
ConditionEviction -0.40664 0.21158 -1.92 0.0546 .
SamplingTime 0.00521 0.00229 2.28 0.0229 *
来自模型输出的信息可以回答 i)。但是,我需要在 Eviction 和 Baseline 之间进行一次额外的比较来回答 ii),这可以通过使用 multcomp 包对均值进行多次比较来完成
library(multcomp)
summary(glht(m1, mcp(Condition = "Tukey")))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: glmmTMB(formula = Cort ~ Condition + SamplingTime + (1 | GroupID/AnimalID),
data = data, family = Gamma("log"), ziformula = ~0,
dispformula = ~1)
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
Baseline - Cortisol == 0 -0.9482 0.1806 -5.250 <0.001 ***
Eviction - Cortisol == 0 -0.4066 0.2116 -1.922 0.1316
Eviction - Baseline == 0 0.5416 0.2133 2.539 0.0297 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
尽管直接从模型摘要(基线 - 皮质醇和驱逐 - 皮质醇)中获得的对比的估计值、标准误差和检验统计值与模型摘要中的相同,但 p 值不同。
这让我想知道:
glmm 输出与使用 glht() 的平均值的多重比较之间 p 值差异的原因是什么?来自 glmm 的 p 值是否也针对多重比较进行了调整,即在我们的案例中,基线和驱逐都与皮质醇进行对比?
这就提出了报告哪个值的问题。我们是否应该报告 a) 模型中估计的对比的模型摘要的 p 值和摘要中不直接可用的对比的多重比较的 p 值,即在我们的例子中,驱逐和基线之间的对比?或 b) 所有对比均值的多重比较的 p 值?
更一般地说,如果模型摘要中未提供一个先验感兴趣的对比,那么所有感兴趣的对比是否都必须包含在均值的多重比较中,还是仅包含在模型摘要中不可用的对比?在我们的例子中:
summary(glht(m1, linfct = mcp(Condition = c("Eviction - Baseline = 0")), type = "Tukey"))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: User-defined Contrasts
Fit: glmmTMB(formula = Cort ~ Condition + SamplingTime + (1 | GroupID/AnimalID),
data = data, family = Gamma("log"), ziformula = ~0,
dispformula = ~1)
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
Eviction - Baseline == 0 0.5416 0.2133 2.539 0.0111 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)