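I'm trying to run a zero-inflated negative binomial GLMM using glmmTMB, but I'm getting NAs for the z and p values in the model summary output. I'm not sure what the cause is; I've followed the vignette and the online help, but I think there must be a problem with my data or with the technique I'm trying to use. My data are similar to the Salamanders example used in the supporting documentation: negative binomially distributed, zero-inflated, and with the same data structure.

What is going wrong? Are these data suitable for family = nbinom2?

Data: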

> head(abun_data)
    depl_ID          Keyword_1 depl_dur     logging n AmbientTemperature ElNino
1 B1-1-14_1        Bearded Pig       82 pre-logging 3           23.33333 before
2 B1-1-14_1  Malayan Porcupine       82 pre-logging 0           24.33333 before
3 B1-1-14_1 Pig-tailed Macaque       82 pre-logging 3           24.33333 before
4 B1-1-14_1        Sambar Deer       82 pre-logging 0           24.00000 before
5 B1-1-14_1        Red Muntjac       82 pre-logging 2           24.00000 before
6 B1-1-14_1  Lesser Mouse-deer       82 pre-logging 1           23.00000 before

> str(abun_data)
'data.frame':   1860 obs. of  7 variables:
 $ depl_ID           : Factor w/ 315 levels "B1-1-14_1","B1-1-14_2",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ Keyword_1         : Factor w/ 6 levels "Bearded Pig",..: 1 2 3 4 5 6 1 2 3 4 ...
 $ depl_dur          : num  82 82 82 82 82 82 26 26 26 26 ...
 $ logging           : Factor w/ 3 levels "logging","post-logging",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ n                 : int  3 0 3 0 2 1 2 0 0 0 ...
 $ AmbientTemperature: num  23.3 24.3 24.3 24 24 ...
 $ ElNino            : Factor w/ 3 levels "after","before",..: 2 2 2 2 2 2 2 2 2 2 ...

My model:

> zinb <- glmmTMB(n ~ Keyword_1 * logging + (1|depl_ID), zi = ~ Keyword_1 * logging,
+                 data = abun_data, family = "nbinom2")
Warning message:
In fitTMB(TMBStruc) :
  Model convergence problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
> summary(zinb)
 Family: nbinom2  ( log )
Formula:          n ~ Keyword_1 * logging + (1 | depl_ID)
Zero inflation:     ~Keyword_1 * logging
Data: abun_data

     AIC      BIC   logLik deviance df.resid 
      NA       NA       NA       NA     1822 

Random effects:

Conditional model:
 Groups  Name        Variance Std.Dev.
 depl_ID (Intercept) 0.5413   0.7358  
Number of obs: 1860, groups:  depl_ID, 310

Overdispersion parameter for nbinom2 family (): 1.29 

Conditional model:
                                                 Estimate Std. Error z value Pr(>|z|)
(Intercept)                                       0.99965         NA      NA       NA
Keyword_1Malayan Porcupine                       -1.30985         NA      NA       NA
Keyword_1Pig-tailed Macaque                      -0.90110         NA      NA       NA
Keyword_1Sambar Deer                             -1.34268         NA      NA       NA
Keyword_1Red Muntjac                             -0.76250         NA      NA       NA
Keyword_1Lesser Mouse-deer                      -16.21798         NA      NA       NA
loggingpost-logging                               0.83935         NA      NA       NA
loggingpre-logging                                0.58252         NA      NA       NA
Keyword_1Malayan Porcupine:loggingpost-logging   -0.53276         NA      NA       NA
Keyword_1Pig-tailed Macaque:loggingpost-logging  -5.52093         NA      NA       NA
Keyword_1Sambar Deer:loggingpost-logging         -0.73450         NA      NA       NA
Keyword_1Red Muntjac:loggingpost-logging          0.04825         NA      NA       NA
Keyword_1Lesser Mouse-deer:loggingpost-logging   -9.74912         NA      NA       NA
Keyword_1Malayan Porcupine:loggingpre-logging    -0.18893         NA      NA       NA
Keyword_1Pig-tailed Macaque:loggingpre-logging   -0.08802         NA      NA       NA
Keyword_1Sambar Deer:loggingpre-logging           0.72087         NA      NA       NA
Keyword_1Red Muntjac:loggingpre-logging           0.51223         NA      NA       NA
Keyword_1Lesser Mouse-deer:loggingpre-logging    15.10588         NA      NA       NA

Zero-inflation model:
                                                Estimate Std. Error z value Pr(>|z|)
(Intercept)                                      -1.3469         NA      NA       NA
Keyword_1Malayan Porcupine                      -11.7164         NA      NA       NA
Keyword_1Pig-tailed Macaque                       1.5618         NA      NA       NA
Keyword_1Sambar Deer                              0.6967         NA      NA       NA
Keyword_1Red Muntjac                            -17.6199         NA      NA       NA
Keyword_1Lesser Mouse-deer                       18.7331         NA      NA       NA
loggingpost-logging                             -19.2344         NA      NA       NA
loggingpre-logging                               -2.1708         NA      NA       NA
Keyword_1Malayan Porcupine:loggingpost-logging   32.6525         NA      NA       NA
Keyword_1Pig-tailed Macaque:loggingpost-logging  -1.2560         NA      NA       NA
Keyword_1Sambar Deer:loggingpost-logging         19.1848         NA      NA       NA
Keyword_1Red Muntjac:loggingpost-logging         -3.4218         NA      NA       NA
Keyword_1Lesser Mouse-deer:loggingpost-logging    7.4168         NA      NA       NA
Keyword_1Malayan Porcupine:loggingpre-logging    14.3338         NA      NA       NA
Keyword_1Pig-tailed Macaque:loggingpre-logging  -22.1736         NA      NA       NA
Keyword_1Sambar Deer:loggingpre-logging           1.6785         NA      NA       NA
Keyword_1Red Muntjac:loggingpre-logging          17.0664         NA      NA       NA
Keyword_1Lesser Mouse-deer:loggingpre-logging   -14.3445         NA      NA       NA

1 Answer

The first clue is the warning:

Model convergence problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')

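This means that the model hasn't converged, or doesn't think it has, to a solution where the log-likelihood surface is downward-curving (i.e., a true maximum). That's why the standard errors can't be computed: if you did the usual computation, they would come out negative or complex. The log-likelihood could be computed, but the model fit is suspect, so glmmTMB returns NA.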
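
To make the "usual computation" concrete, here is a minimal base-R sketch. The matrix H is a made-up 2x2 example, not taken from the fitted model; it stands in for the Hessian of the negative log-likelihood, whose eigenvalues should all be positive at a true maximum of the likelihood.

```r
# Toy Hessian of the negative log-likelihood (hypothetical values).
# At a true maximum of the likelihood, all eigenvalues would be positive.
H <- matrix(c(2,  0,
              0, -0.5), nrow = 2, byrow = TRUE)

eigen(H)$values                  # one eigenvalue is negative: not positive definite

# The usual computation: variances are the diagonal of the inverse Hessian
v  <- diag(solve(H))             # 0.5 and -2, i.e. one negative "variance"
se <- suppressWarnings(sqrt(v))  # the square root of a negative number is NaN
se
```

For a fitted glmmTMB object, I believe the corresponding flag is stored in zinb$sdr$pdHess (it would be FALSE here), but check the troubleshooting vignette for your installed version.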

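Next question: why? Sometimes this is mysterious and hard to diagnose, but in this case we have a good clue: when you see extreme parameter values (e.g., |beta| > 10) in a (non-identity-link) GLM, it almost always means that some form of complete separation is occurring. That is, there are some covariate combinations (e.g., Keyword_1 == Lesser Mouse-deer) where you always have zero counts. On the log scale, this means the density is infinitely lower than in covariate combinations where you have a positive mean. The parameter is about -16, which corresponds to an expected multiplicative difference in density of exp(-16), or roughly 1e-07. That's not infinitely small, but it's small enough that the differences in log-likelihood glmmTMB sees become small enough for the optimizer to stop. However, since the likelihood surface is almost flat, it can't compute the curvature, etc.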
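
One way to look for complete separation is to tabulate, for each covariate combination, how many observations have a nonzero count. The sketch below uses a small made-up data frame (toy, with hypothetical counts) so that it runs on its own; with the real data you would put abun_data in its place.

```r
# Made-up data with the same column names as abun_data (values are hypothetical)
toy <- data.frame(
  Keyword_1 = rep(c("Bearded Pig", "Lesser Mouse-deer"), each = 6),
  logging   = rep(c("logging", "post-logging", "pre-logging"), times = 4),
  n         = c(3, 1, 2, 0, 4, 2,    # Bearded Pig
                0, 0, 1, 0, 0, 2)    # Lesser Mouse-deer
)

# Number of observations with a positive count in each species x logging cell
nonzero <- with(toy, tapply(n, list(Keyword_1, logging), function(x) sum(x > 0)))
nonzero

# Cells equal to 0 (every count is zero) are candidates for complete separation
which(nonzero == 0, arr.ind = TRUE)

# Scale of the estimate: a log-scale coefficient of -16 is a multiplier of about 1e-07
exp(-16)
```

Any cell that shows up in the which() output has no positive counts at all, so the corresponding coefficient on the log scale is being pushed toward minus infinity.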

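You could merge or drop categories, or do some form of regularization (e.g., see here, here ...); it might also make sense to treat your Keyword_1 variable as a random effect, which would also have the effect of regularizing the estimates.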
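
As a rough sketch of two of these options: merging sparse categories can be done in base R by reassigning factor levels, and treating Keyword_1 as a random effect amounts to a change in the model formula. The label "Other", the choice of which species to merge, and the formula below are my assumptions for illustration only; none of this has been run on the real data, and the glmmTMB call is left commented out because it needs abun_data.

```r
# Toy factor standing in for abun_data$Keyword_1
sp <- factor(c("Bearded Pig", "Sambar Deer", "Lesser Mouse-deer",
               "Malayan Porcupine", "Bearded Pig"))

# Option 1: merge sparse categories into one level ("Other" is a hypothetical label)
sp_merged <- sp
to_merge  <- c("Lesser Mouse-deer", "Malayan Porcupine")
levels(sp_merged)[levels(sp_merged) %in% to_merge] <- "Other"
table(sp_merged)

# Option 2: species as a random intercept (untested sketch; needs glmmTMB and abun_data).
# Note that this drops the species-by-logging interaction from the fixed effects.
# library(glmmTMB)
# zinb_re <- glmmTMB(n ~ logging + (1 | Keyword_1) + (1 | depl_ID),
#                    ziformula = ~ 1,
#                    data = abun_data, family = nbinom2)
```

Whether a simpler zero-inflation formula such as ~ 1 is appropriate depends on the data; it is used here only to keep the sketch small.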

Answered 2020-06-07T01:00:57.910