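I originally posted this question on Cross Validated (Stack Exchange) but got no answer, so I am trying here. I am trying to work out how to obtain lsmeans for a piecewise linear mixed-effects model with a random intercept and random slopes (fitted with the nlme package). My data are the weekly math test scores of a group of male and female students before and after the introduction of a daily meditation routine. A minimal reproducible example that creates the data frame and fits the model follows: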
library(nlme)
library("lsmeans")
# Subject's ID
ID <- c(1,1,1,
        2,2,2,
        3,3,3,
        4,4,4,
        5,5,5,
        6,6,6,
        7,7,7,
        8,8,8)
# Time (weeks) before introduction of routine
time1 <- c(-1,0,0,
           -1,0,0,
           -1,0,0,
           -1,0,0,
           -1,0,0,
           -1,0,0,
           -1,0,0,
           -1,0,0)
# Time (weeks) after introduction of routine
time2 <- c(0,0,1,
           0,0,1,
           0,0,1,
           0,0,1,
           0,0,1,
           0,0,1,
           0,0,1,
           0,0,1)
# weekly math test scores
mscore <- c(80,92,73,
            75,80,85,
            60,75,70,
            75,80,75,
            78,84,75,
            78,91,95,
            64,72,71,
            84,92,70)
# create dataframe
longdata <- data.frame(ID, time1, time2, mscore)
head(longdata)
# fit model
pwmodel <- lme(mscore ~ time1 + time2,
               random = ~ time1 + time2 | ID,
               data = longdata,
               method = "ML")
# calculate marginal means:
# with both variables
lsmeans(pwmodel, ~(time1+time2),
        at=list(time1=c(-1,0,1), time2=c(-1,0,1)) )
# only with one variable
lsmeans(pwmodel, ~time1,
        at=list(time1=c(-1,0,1) ))
Here time1 and time2 represent the time before and after the start of the daily meditation routine, respectively.
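To make the coding explicit, the model fitted above can be written out as follows. This is only a sketch of my understanding: the symbols $\beta$, $b_i$, $\varepsilon$, $\Psi$ and $\sigma^2$ are my own notation, not something printed by nlme, and $w$ is just a label for the study week.

```latex
\begin{align*}
\text{mscore}_{ij} &= (\beta_0 + b_{0i})
  + (\beta_1 + b_{1i})\,\text{time1}_{ij}
  + (\beta_2 + b_{2i})\,\text{time2}_{ij}
  + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i},\, b_{2i})^{\top} &\sim N(0, \Psi),
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2). \\[6pt]
% Coding used in the data frame: one (time1, time2) pair per week
w = -1 &: (\text{time1}, \text{time2}) = (-1, 0)
  \;\Rightarrow\; E[\text{mscore}] = \beta_0 - \beta_1, \\
w = 0 &: (\text{time1}, \text{time2}) = (0, 0)
  \;\Rightarrow\; E[\text{mscore}] = \beta_0, \\
w = 1 &: (\text{time1}, \text{time2}) = (0, 1)
  \;\Rightarrow\; E[\text{mscore}] = \beta_0 + \beta_2.
\end{align*}
```

Under this coding only three (time1, time2) pairs actually occur in the data; any other pair, such as (1, -1), is an extrapolation of the fitted plane.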
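The question is: what is the correct way to obtain lsmeans (or emmeans, if that is preferable) from this model at times -1, 0, and 1? Should both time variables be specified, or only one of them (time1 or time2)?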
The output of the two approaches is shown below:
> #with both variables
> lsmeans(pwmodel, ~(time1+time2),
+ at=list(time1=c(-1,0,1), time2=c(-1,0,1)) )
 time1 time2 lsmean   SE df lower.CL upper.CL
    -1    -1   80.8 5.46  7     67.8     93.7
     0    -1   89.8 5.47  7     76.8    102.7
     1    -1   98.8 5.81  7     85.0    112.5
    -1     0   74.2 2.88  7     67.4     81.1
     0     0   83.2 2.77  7     76.7     89.8
     1     0   92.2 3.28  7     84.5    100.0
    -1     1   67.8 3.34  7     59.9     75.6
     0     1   76.8 3.12  7     69.4     84.1
     1     1   85.8 3.47  7     77.5     94.0
Degrees-of-freedom method: containment
Confidence level used: 0.95
> # only with one variable
> lsmeans(pwmodel, ~time1,
+ at=list(time1=c(-1,0,1) ))
 time1 lsmean   SE df lower.CL upper.CL
    -1     71 2.59  7     64.9     77.1
     0     80 2.38  7     74.4     85.6
     1     89 2.89  7     82.2     95.8
Results are averaged over the levels of: time2
Degrees-of-freedom method: containment
Confidence level used: 0.95
They clearly return different results, but shouldn't the two approaches give the same values?
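A quick arithmetic check on the printed (rounded) values, offered as a sketch rather than a definitive explanation: reading the two slopes off the first table and plugging them back in reproduces the second table if time2 is held at 0.5. My assumption is that this is what the note "Results are averaged over the levels of: time2" refers to, namely an equal-weight average over the two observed values of time2 (0 and 1). I have not confirmed this against the package documentation.

```latex
\begin{align*}
% Slopes read off the first table (rows with the other variable fixed at 0)
\text{slope}_{\text{time1}} &\approx 92.2 - 83.2 = 9.0, \\
\text{slope}_{\text{time2}} &\approx 76.8 - 83.2 = -6.4. \\[6pt]
% Predictions at time2 = 0.5, starting from the time2 = 0 rows
\hat{y}(\text{time1} = -1,\ \text{time2} = 0.5) &\approx 74.2 + 0.5 \times (-6.4) = 71.0, \\
\hat{y}(\text{time1} = 0,\ \text{time2} = 0.5) &\approx 83.2 + 0.5 \times (-6.4) = 80.0, \\
\hat{y}(\text{time1} = 1,\ \text{time2} = 0.5) &\approx 92.2 + 0.5 \times (-6.4) = 89.0.
\end{align*}
```

These match the 71, 80 and 89 in the second table. If that reading is right, the two calls evaluate the same fitted plane at different time2 values (-1, 0, 1 in the first call, 0.5 in the second), which would account for no row of one table matching a row of the other.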