当我进行数据分析时,我遇到了一个严重的问题。
我想研究治疗强度如何影响响应变量 y。实地考察是在很久以前的两年前在不同的SITES进行的。据我所知,group_bySITES
并YEAR
获得 y 的均值或总和值并不是很好,尤其是不包括 sum,因为在不同的SITES
.
然后我尝试glmmTMB
使用包构建负二项式模型并检查DHARMa
:我的残差模式不太好。
library(glmmTMB)
model <- glmmTMB::glmmTMB(y~intensity+YEAR+(1|SITES),data = f,family = 'nbinom2')
res = simulateResiduals(model)
res %>% plot()
我试图检查它是否也违反了空间自相关的假设DHARMa
。它表明严重违反假设(p 值 = 0.0083)。所以,我试图解决这个问题。
res2 <- recalculateResiduals(res, group = f$SITES)
testSpatialAutocorrelation(res2,
x = f %>% group_by(SITES) %>% summarise(lat=mean(lat)) %>% pull(lat),
y = f %>% group_by(SITES) %>% summarise(long=mean(long)) %>% pull(long)
)
DHARMa Moran's I test for distance-based autocorrelation
data: res2
observed = 0.1135, expected = -0.0714, sd = 0.0701, p-value = 0.0083
alternative hypothesis: Distance-based autocorrelation
因为通过glmmTMB
允许将坐标添加到方差结构中创建的模型。因此,我尝试使用以下 3 种方法将坐标添加到上述模型:
mat()
gau()
exp()
temp$pos <- numFactor(f$long,f$lat) # coordinate
# mat
FormA1 = formula(y ~ intensity + YEAR + mat(pos + 0 | SITES))
FormA2 = formula(y ~ intensity + YEAR + (1|SITES) + mat(pos + 0 | SITES))
# gau
FormB1 = formula(y ~ intensity + YEAR + gau(pos + 0 | SITES))
FormA2 = formula(y ~ intensity + YEAR + (1|SITES) + gau(pos + 0 | SITES))
# exp
FormA1 = formula(y ~ intensity + YEAR + exp(pos + 0 | SITES))
FormB2 = formula(y ~ intensity + YEAR + (1|SITES) + exp(pos + 0 | SITES))
model <- glmmTMB::glmmTMB(FormA1, #FormA,FormB,FormC I tried one by one.
data = f,
family = 'nbinom2')
但结果是只有exp()
方差结构可以运行。gau()
和的方差结构mat()
会告诉警告:
Warning in fitTMB(TMBStruc) :
Model convergence problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
Warning in fitTMB(TMBStruc) :
Model convergence problem; false convergence (8). See vignette('troubleshooting')
即使exp()
可以运行,但当我尝试检查空间相关性时,也严重违反了假设(p 值 = 0.0087)。
所以,我放弃了将坐标添加到方差结构中的方法glmmTMB
,转而使用glmmPQL
. 因为,它可以给出correlation variance structure
forMASS::glmmPQL
和类似的gls()
和lme()
在nlme
包中。
相关结构为:
corSpher
corSpatial
corGau
corExp
corRatio
corLin
但是,当我尝试运行如下代码时:
model <- MASS::glmmPQL(y ~ intensity, random = ~ 1|SITES,family = negative.binomial(theta = 0.9),
correlation = corExp(form = ~ long+lat|SITES), #corExp,corGau, corSpher, corLin ,corRatio, all I tried
data = f)
但他们总是显示错误:Error in getCovariate.corSpatial(object, data = data):cannot have zero distances in "corSpatial"
.
考虑到我的实地调查情况,必须有人考虑对我的数据进行分组并获得y的总和或平均值,但问题是:
- 由于组内不同的观察结果,组和
YEAR
ySITES
的总和必须产生偏差误差。 - 当 group by 获得 y 的平均值时可能是合适的,但我不想缩小观察的数量。.
我在和其他网站上寻找了很多stackoverflow
关于如何解决问题的帖子:cannot have zero distances in "corSpatial"
. 但是我没有得到很好的分辨率,包括有人说:障碍是数据中的重复坐标。当然,他说的对,但我已经把lon + lat|SITES
相关结构,它也显示cannot have zero distances in "corSpatial"
。所以,我意识到我必须从互联网上寻求帮助。
所以,我想说的问题是:根据我的数据,如何创建一个不违反残差模式和空间自相关的好的混合线性模型。
当然,你可以使用其他模型,但最好不要使用贝叶斯,因为这对我来说真的很奇怪。
如果您真的想帮助我,我希望您可以发布您的代码,以便我可以再次运行它,而不是在这里输入一些句子。我已经上传了我的部分原始数据,以便为您提供支持。我真的很想得到互联网上所有朋友的帮助。
在我的时区午夜 3 点提前致谢。
这是我的数据:
f =
structure(list(YEAR = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2000",
"2001"), class = "factor"), SITES = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L,
9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 11L, 11L, 11L, 11L, 11L,
11L, 11L, 11L, 11L, 11L, 11L, 11L, 14L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L,
15L, 15L, 15L, 15L, 15L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L,
13L, 13L, 13L), .Label = c("A1", "A10", "A11", "A12", "A17",
"A18", "A20", "A3", "A8", "A9", "B1", "B14", "B16", "B3", "B6"
), class = "factor"), long = c(80.41081194, 80.41081194, 80.41081194,
80.41081194, 80.41081194, 80.41081194, 80.41081194, 80.41081194,
80.41081194, 80.48462682, 80.48462682, 80.48462682, 80.48462682,
80.48462682, 80.48462682, 80.48462682, 80.48462682, 80.48462682,
80.57384333, 80.57384333, 80.57384333, 80.57384333, 80.57384333,
80.57384333, 80.57384333, 80.57384333, 80.57384333, 80.48125,
80.48125, 80.48125, 80.48125, 80.48125, 80.48125, 80.48125, 80.48125,
80.48125, 80.56281944, 80.56281944, 80.56281944, 80.56281944,
80.56281944, 80.56281944, 80.56281944, 80.56281944, 80.56281944,
80.46875, 80.46875, 80.46875, 80.46875, 80.46875, 80.46875, 80.46875,
80.46875, 80.46875, 80.41125278, 80.41125278, 80.41125278, 80.41125278,
80.41125278, 80.41125278, 80.41125278, 80.41125278, 80.41125278,
80.30014935, 80.30014935, 80.30014935, 80.30014935, 80.30014935,
80.30014935, 80.30014935, 80.30014935, 80.30014935, 80.26167842,
80.26167842, 80.26167842, 80.26167842, 80.26167842, 80.26167842,
80.26167842, 80.26167842, 80.26167842, 80.20558138, 80.20558138,
80.20558138, 80.20558138, 80.20558138, 80.20558138, 80.20558138,
80.20558138, 80.20558138, 80.349776, 80.349776, 80.349776, 80.349776,
80.349776, 80.349776, 80.349776, 80.349776, 80.349776, 80.349776,
80.349776, 80.349776, 80.373821, 80.373821, 80.373821, 80.373821,
80.373821, 80.373821, 80.373821, 80.373821, 80.373821, 80.373821,
80.373821, 80.373821, 80.434011, 80.434011, 80.434011, 80.434011,
80.434011, 80.434011, 80.434011, 80.434011, 80.434011, 80.434011,
80.434011, 80.434011, 80.520748, 80.520748, 80.520748, 80.520748,
80.520748, 80.520748, 80.520748, 80.520748, 80.520748, 80.520748,
80.520748, 80.520748, 80.507795, 80.507795, 80.507795, 80.507795,
80.507795, 80.507795, 80.507795, 80.507795, 80.507795, 80.507795,
80.507795, 80.507795), lat = c(41.17633167, 41.17633167, 41.17633167,
41.17633167, 41.17633167, 41.17633167, 41.17633167, 41.17633167,
41.17633167, 41.12161667, 41.12161667, 41.12161667, 41.12161667,
41.12161667, 41.12161667, 41.12161667, 41.12161667, 41.12161667,
40.97858194, 40.97858194, 40.97858194, 40.97858194, 40.97858194,
40.97858194, 40.97858194, 40.97858194, 40.97858194, 41.03570917,
41.03570917, 41.03570917, 41.03570917, 41.03570917, 41.03570917,
41.03570917, 41.03570917, 41.03570917, 40.80911056, 40.80911056,
40.80911056, 40.80911056, 40.80911056, 40.80911056, 40.80911056,
40.80911056, 40.80911056, 40.80239444, 40.80239444, 40.80239444,
40.80239444, 40.80239444, 40.80239444, 40.80239444, 40.80239444,
40.80239444, 40.82870833, 40.82870833, 40.82870833, 40.82870833,
40.82870833, 40.82870833, 40.82870833, 40.82870833, 40.82870833,
41.02665248, 41.02665248, 41.02665248, 41.02665248, 41.02665248,
41.02665248, 41.02665248, 41.02665248, 41.02665248, 40.99086151,
40.99086151, 40.99086151, 40.99086151, 40.99086151, 40.99086151,
40.99086151, 40.99086151, 40.99086151, 41.00486149, 41.00486149,
41.00486149, 41.00486149, 41.00486149, 41.00486149, 41.00486149,
41.00486149, 41.00486149, 41.021529, 41.021529, 41.021529, 41.021529,
41.021529, 41.021529, 41.021529, 41.021529, 41.021529, 41.021529,
41.021529, 41.021529, 40.970129, 40.970129, 40.970129, 40.970129,
40.970129, 40.970129, 40.970129, 40.970129, 40.970129, 40.970129,
40.970129, 40.970129, 40.85327, 40.85327, 40.85327, 40.85327,
40.85327, 40.85327, 40.85327, 40.85327, 40.85327, 40.85327, 40.85327,
40.85327, 41.045005, 41.045005, 41.045005, 41.045005, 41.045005,
41.045005, 41.045005, 41.045005, 41.045005, 41.045005, 41.045005,
41.045005, 41.083288, 41.083288, 41.083288, 41.083288, 41.083288,
41.083288, 41.083288, 41.083288, 41.083288, 41.083288, 41.083288,
41.083288), y = c(36L, 39L, 45L, 167L, 150L, 113L, 74L, 80L,
72L, 197L, 249L, 212L, 195L, 173L, 105L, 43L, 44L, 27L, 80L,
69L, 52L, 306L, 308L, 188L, 123L, 106L, 121L, 27L, 32L, 26L,
197L, 232L, 245L, 114L, 110L, 170L, 12L, 46L, 57L, 254L, 332L,
207L, 33L, 95L, 89L, 40L, 81L, 77L, 119L, 122L, 194L, 81L, 54L,
71L, 34L, 29L, 33L, 136L, 129L, 189L, 42L, 38L, 69L, 53L, 39L,
55L, 62L, 33L, 49L, 26L, 15L, 31L, 130L, 144L, 79L, 76L, 41L,
86L, 21L, 25L, 16L, 104L, 115L, 109L, 69L, 98L, 84L, 26L, 19L,
30L, 10L, 12L, 4L, 13L, 9L, 4L, 27L, 0L, 4L, 6L, 2L, 6L, 3L,
1L, 0L, 1L, 1L, 1L, 6L, 3L, 5L, 0L, 5L, 2L, 0L, 1L, 2L, 2L, 0L,
2L, 1L, 4L, 1L, 10L, 6L, 9L, 2L, 2L, 2L, 0L, 5L, 0L, 17L, 7L,
20L, 33L, 34L, 45L, 1L, 2L, 0L, 2L, 2L, 2L, 5L, 6L, 3L, 12L,
22L, 12L), intensity = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("low",
"high"), class = "factor")), class = "data.frame", row.names = c(NA,
-150L))