2

简短的问题是:profile() 返回 12 个参数值。如何使它返回更大的数字?

我提出问题的动机是重现David W. Hosmer Jr.、Stanley Lemeshow 和 Rodney X. Sturdivant (2009)的Applied Logistic Regression 3rd Edition中的图 1.3 ,它绘制了x = age over 的系数对数可能性的轮廓confint() 间隔。

图 1.3 HLS

glm 模型是

fit <- glm(chd ~ age, data = chdage, family = binomial(link = "logit"))

它将 100 名患者的是否患有冠心病与年龄联系起来。该模型的结果与第 1 页正文中的表 1.3 一致。10.

为方便起见,数据的 csv 文件在我的要点中

使用Ben Bolker提供的指导,将 MASS::profile 的偏差输出乘以 -0.5,以使用jebyrnes在稍后对同一帖子的评论中提供的 tidy 函数在2011 年的帖子中转换为负对数似然。

library(dplyr)
library(MASS)
library(purrr)

get_profile_glm <- function(aglm){
  prof <- MASS:::profile.glm(aglm)
  disp <- attr(prof,"summary")$dispersion

  purrr::imap_dfr(prof, .f = ~data.frame(par = .y,
                                deviance=.x$z^2*disp+aglm$deviance, 
                                values = as.data.frame(.x$par.vals)[[.y]],
                                stringsAsFactors = FALSE))

}

pll <- get_profile_glm(fit) %>% filter(par == "age") %>% mutate(beta = values) %>% mutate(pll = deviance * -0.5) %>% select(-c(par,values, deviance))

pll

> pll
      beta    pll
1  0.04895 -57.70
2  0.06134 -56.16
3  0.07374 -55.02
4  0.08613 -54.25
5  0.09853 -53.81
6  0.11092 -53.68
7  0.12332 -53.80
8  0.13571 -54.17
9  0.14811 -54.74
10 0.16050 -55.49
11 0.17290 -56.41
12 0.18529 -57.47

可以绘制此图以获得 HLS 图 1.3 的近似值,其中 alpha = 0.95 区间的 ablines 从

confint(fit)

logLik(fit)

图例中的不对称性可以用

asymmetry <- function(x) {
    ci <- confint(x, level = 0.95)
    ci_lower <- ci[2,1]
    ci_upper <- ci[2,2]
    coeff <- x$coefficients[2]
    round(100 * ((ci_upper - coeff) - (coeff - ci_lower))/(ci_upper - ci_lower), 2)
}

asym <- assymetry(fit)

情节是用

ggplot(data = pll, aes(x = beta, y = pll)) + 
geom_line() + 
scale_x_continuous(breaks = c(0.06, 0.08, 0.10, 0.12, 0.14, 0.16)) + 
scale_y_continuous(breaks = c(-57, -56, -55, -54)) + 
xlab("Coefficient for age") + 
ylab("Profile log-likelihood function") + 
geom_vline(xintercept = confint(fit)[2,1]) + 
geom_vline(xintercept = confint(fit)[2,2]) + 
geom_hline(yintercept = (logLik(fit) - (qchisq(0.95, df = 1)/2))) + 
theme_classic() + 
ggtitle(paste("Asymmetry =", scales::percent(asym/100, accuracy = 0.1))) + 
theme(plot.title = element_text(hjust = 0.5))

需要调整的地块

需要进行两项调整:

  1. 曲线应通过分别沿 x 和 y 轴添加 beta 值和对数似然值来平滑。

  2. beta 的范围应设置为大约 [0.0575,0.1625](视觉上,从图中)。我认为这可以根据需要通过子集来完成。

关于图中 logLik y 截距的注释。它似乎是基于对数似然的转置值。请参阅第 1-3 页的表 1-3。10,其中它给出为 -53.676546,与 p 上的等式相比。19 转置为 -53.6756。

4

1 回答 1

1

Thanks to prompting from kjetil b halvorsen and the comment by Ben Bolker in the bbmle package vignette for mle2

– del Step size (scaled by standard error) (Default: zmax/5.) Presum- ably (?) copied from MASS::profile.glm, which says (in ?profile.glm): “[d]efault value chosen to allow profiling at about 10 parameter values.”</p>

I "solved" my issue with a change to the get_profile_glm function in the original post:

prof <- MASS:::profile.glm(aglm, del = .05, maxsteps = 52)

which yielded 100 points and produced the plot I was hoping for:

Figure 1.3

I say "solved" because the values hardwired were determined by trial and error. But it must do for now.

于 2019-12-16T01:38:19.770 回答