r - Custom Link function works for GLM but not mgcv GAM

Question

Apologies if the answer is obvious, but I've spent quite some time trying to use a custom link function in mgcv::gam.

In short,

I want to use a modified probit link from the psyphy package (psyphy::probit.2asym, which I call custom_link).
I can create a stats family object with this link and use it in the family argument of glm:

m <- glm(y~x, family=binomial(link=custom_link), ... )
It does not work when used as the family argument of mgcv's gam:

m <- gam(y~s(x), family=binomial(link=custom_link), ... )

I get the error: Error in fix.family.link.family(family) : link not recognised

I do not understand the reason for this error; both glm and gam work if I specify the standard link = "probit".

So my question can be summarized as:

what is missing in this custom link that works for glm but not for gam?

Thanks in advance if you can give me a hint on what I should do.

Link function

probit.2asym <- function(g, lam) {
    if ((g < 0) || (g > 1))
        stop("g must be in [0, 1]")
    if ((lam < 0) || (lam > 1))
        stop("lam must be in [0, 1]")
    ## link: map mu from (g, 1 - lam) to the probit scale, clamped away from the bounds
    linkfun <- function(mu) {
        mu <- pmin(mu, 1 - (lam + .Machine$double.eps))
        mu <- pmax(mu, g + .Machine$double.eps)
        qnorm((mu - g) / (1 - g - lam))
    }
    ## inverse link: probabilities range over (g, 1 - lam)
    linkinv <- function(eta) {
        g + (1 - g - lam) * pnorm(eta)
    }
    ## derivative d mu / d eta
    mu.eta <- function(eta) {
        (1 - g - lam) * dnorm(eta)
    }
    valideta <- function(eta) TRUE
    link <- paste("probit.2asym(", g, ", ", lam, ")", sep = "")
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, name = link),
              class = "link-glm")
}

score 4 · Accepted Answer

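As you may know, glm fits by iteratively reweighted least squares (IRLS). Early versions of gam extended this to penalized iteratively reweighted least squares (P-IRLS), which is done by the function gam.fit. In some contexts this is known as performance iteration.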

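Since 2008 (or maybe even a bit earlier), gam.fit3, which is based on what is called outer iteration, has replaced gam.fit as the gam default. This change requires some extra information about the family, which you can read about in ?fix.family.link. Your custom link is not among the links that fix.family.link knows about, hence the error "link not recognised".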
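To see what that extra information looks like, a small sketch is to compare a standard family before and after fix.family.link. The names plain, fixed and added are mine, and the call is guarded because I am assuming mgcv may not be installed everywhere. On my reading of the help page, the added components should include d2link, d3link and d4link, the higher derivatives of the link with respect to mu.

```r
## Guarded so the snippet still runs where mgcv is not installed.
if (requireNamespace("mgcv", quietly = TRUE)) {
  plain <- binomial(link = "probit")
  fixed <- mgcv::fix.family.link(plain)
  ## components that fix.family.link attached to the family object
  added <- setdiff(names(fixed), names(plain))
  print(added)
}
```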

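The major difference between the two is how the iteration over the coefficients beta and the iteration over the smoothing parameters lambda are nested.

Performance iteration interleaves them: at each P-IRLS step for beta, lambda is re-estimated for the current working (penalized weighted least squares) problem;
Outer iteration puts the optimization of lambda on the outside: for each trial value of lambda, the P-IRLS iteration for beta is run to convergence before lambda is updated.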

Outer iteration is the more stable of the two and is less likely to suffer from convergence failure.
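To make the two loop structures concrete, here is a toy sketch. It is not mgcv's code: I am assuming a small polynomial basis with a ridge penalty as a stand-in for a spline basis, a logit link, and a GCV-type score, and every function name (working, pwls, perf_iteration, pirls, outer_iteration) is made up for illustration.

```r
set.seed(1)
n <- 200
x <- runif(n)
y <- rbinom(n, 1, plogis(-1 + 3 * x))
X <- cbind(1, poly(x, 5))        # toy basis (assumption: stands in for a spline basis)
S <- diag(c(0, rep(1, 5)))       # ridge penalty, intercept left unpenalized

## IRLS working weights and pseudo-data for the logit link
working <- function(beta) {
  eta <- drop(X %*% beta)
  mu  <- plogis(eta)
  w   <- pmax(mu * (1 - mu), 1e-10)
  list(w = w, z = eta + (y - mu) / w, mu = mu)
}

## one penalized weighted least-squares solve for a given lambda
pwls <- function(wd, lambda) {
  XtWX <- crossprod(X * wd$w, X)
  A    <- XtWX + lambda * S
  beta <- drop(solve(A, crossprod(X * wd$w, wd$z)))
  list(beta = beta, edf = sum(diag(solve(A, XtWX))))
}

gcv <- function(rss, edf) n * rss / (n - edf)^2

## performance iteration: lambda is re-selected inside every IRLS step
perf_iteration <- function(maxit = 50) {
  beta <- rep(0, ncol(X))
  for (it in seq_len(maxit)) {
    wd <- working(beta)
    score <- function(rho) {
      f <- pwls(wd, exp(rho))
      gcv(sum(wd$w * (wd$z - X %*% f$beta)^2), f$edf)
    }
    rho <- optimize(score, c(-10, 10))$minimum
    new_beta <- pwls(wd, exp(rho))$beta
    done <- max(abs(new_beta - beta)) < 1e-6
    beta <- new_beta
    if (done) break
  }
  list(beta = beta, lambda = exp(rho), iterations = it)
}

## inner loop of outer iteration: IRLS to convergence at a fixed lambda
pirls <- function(lambda, maxit = 50) {
  beta <- rep(0, ncol(X))
  for (it in seq_len(maxit)) {
    f <- pwls(working(beta), lambda)
    done <- max(abs(f$beta - beta)) < 1e-8
    beta <- f$beta
    if (done) break
  }
  dev <- sum(binomial()$dev.resids(y, working(beta)$mu, rep(1, n)))
  list(beta = beta, edf = f$edf, dev = dev)
}

## outer iteration: optimize the score over lambda, one converged fit per trial
outer_iteration <- function() {
  score <- function(rho) { f <- pirls(exp(rho)); gcv(f$dev, f$edf) }
  rho <- optimize(score, c(-10, 10))$minimum
  c(pirls(exp(rho)), lambda = exp(rho))
}

fit_perf  <- perf_iteration()
fit_outer <- outer_iteration()
```

In perf_iteration the score being minimized changes at every step because the working data change, which is one way to see why that scheme may fail to settle; in outer_iteration the score is a fixed function of lambda.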

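gam has an argument optimizer. By default it is optimizer = c("outer", "newton"), i.e. the Newton method for outer iteration; but if you set optimizer = "perf", it uses performance iteration instead.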

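So, after the above overview, we have two options:

still use outer iteration, but extend your custom link function;
use performance iteration, to stay in line with glm.

I am being lazy, so I will demonstrate the second (actually, I do not feel too confident about taking the first approach).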

Reproducible example

You did not provide a reproducible example, so I have prepared one below.

set.seed(0)
x <- sort(runif(500, 0, 1))    ## covariates (sorted to make plotting easier)
eta <- -4 + 3 * x * exp(x) - 2 * log(x) * sqrt(x)   ## true linear predictor
p <- binomial(link = "logit")$linkinv(eta)    ## true probability (response)
y <- rbinom(500, 1, p)    ## binary observations

table(y)    ## a quick check that data are not skewed
#  0   1 
#271 229

I will take g = 0.1 and lam = 0.1 for your intended link function probit.2asym:

probit2 <- probit.2asym(0.1, 0.1)

par(mfrow = c(1,3))

## fit a glm with logit link
glm_logit <- glm(y ~ x, family = binomial(link = "logit"))
plot(x, eta, type = "l", main = "glm with logit link")
lines(x, glm_logit$linear.predictors, col = 2)

## glm with probit.2asym
glm_probit2 <- glm(y ~ x, family = binomial(link = probit2))
plot(x, eta, type = "l", main = "glm with probit2")
lines(x, glm_probit2$linear.predictors, col = 2)

## gam with probit.2asym
library(mgcv)
gam_probit2 <- gam(y ~ s(x, bs = 'cr', k = 3), family = binomial(link = probit2),
                   optimizer = "perf")
plot(x, eta, type = "l", main = "gam with probit2")
lines(x, gam_probit2$linear.predictors, col = 2)

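I have used the cubic regression spline basis cr (a natural cubic spline) for s(x), since for a univariate smooth the default thin plate spline is unnecessary. I have also set a small basis dimension k = 3 (it cannot be smaller for a cubic spline), as my toy data are close to linear and no large basis dimension is needed. More importantly, this seems to prevent convergence failure of performance iteration on my toy dataset.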
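As for the first option, here is a minimal sketch of what extending the link could look like, assuming that what outer iteration needs from the family is the second to fourth derivatives of the link with respect to mu, stored as d2link, d3link and d4link (my reading of ?fix.family.link). The helper add_link_derivs and the condensed constructor probit_2asym are hypothetical names of mine. Because the link is a probit applied to a rescaled mu, the derivatives take the same form as for the plain probit once written in terms of eta and mu.eta.

```r
## Condensed copy of the question's link constructor (argument checks omitted).
probit_2asym <- function(g, lam) {
  span <- 1 - g - lam
  structure(list(
    linkfun  = function(mu) qnorm((mu - g) / span),
    linkinv  = function(eta) g + span * pnorm(eta),
    mu.eta   = function(eta) span * dnorm(eta),
    valideta = function(eta) TRUE,
    name     = paste0("probit.2asym(", g, ", ", lam, ")")
  ), class = "link-glm")
}

## Hypothetical helper: attach derivatives of the link w.r.t. mu to a family.
## With eta = linkfun(mu) and m = mu.eta(eta):
##   g''(mu) = eta / m^2, g'''(mu) = (1 + 2 eta^2) / m^3,
##   g''''(mu) = (7 eta + 6 eta^3) / m^4
add_link_derivs <- function(fam) {
  fam$d2link <- function(mu) {
    eta <- fam$linkfun(mu)
    eta / fam$mu.eta(eta)^2
  }
  fam$d3link <- function(mu) {
    eta <- fam$linkfun(mu)
    (1 + 2 * eta^2) / fam$mu.eta(eta)^3
  }
  fam$d4link <- function(mu) {
    eta <- fam$linkfun(mu)
    (7 * eta + 6 * eta^3) / fam$mu.eta(eta)^4
  }
  fam
}

fam2 <- add_link_derivs(binomial(link = probit_2asym(0.1, 0.1)))

## numerical check of d2link against a finite difference of g'(mu) = 1 / mu.eta
mu <- seq(0.2, 0.8, by = 0.1)
h  <- 1e-5
dlink <- function(mu) 1 / fam2$mu.eta(fam2$linkfun(mu))
fd2 <- (dlink(mu + h) - dlink(mu - h)) / (2 * h)
d2_err <- max(abs(fam2$d2link(mu) - fd2) / abs(fd2))
```

One could then try gam(y ~ s(x, bs = 'cr', k = 3), family = fam2) with the default optimizer. I have not verified that this passes every check in every mgcv version, so treat it as a starting point rather than a tested solution.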
