根据我的评论,我认为您可以从直接从系数计算预测开始。这是一个将输出与predict.glm
直接在数据上计算的预测概率进行比较的示例:
# construct some data and model it
# y ~ x1 + x2
set.seed(1)
x1 <- runif(100)
x2 <- runif(100)
y <- rbinom(100,1,(x1+x2)/2)
data1 <- data.frame(x1=x1,x2=x2,y=y)
x3 <- runif(100)
x4 <- runif(100)
y2 <- rbinom(100,1,(x3+x4)/2)
data2 <- data.frame(x1=x3,x2=x4,y=y2)
glm1 <- glm(y~x1+x2,data=data1,family=binomial)
# extract coefs
#summary(glm1)
coef1 <- coef(glm1)
# calculate predicted probabilities for current data
tmp1 <- coef1[1] + (data1$x1*coef1[2]) + (data1$x2*coef1[3])
pr1 <- 1/(1+(1/exp(tmp1)))
# these match those from `predict`:
all.equal(pr1,predict(glm1,data1,type='response'))
# now apply to new data:
tmp2 <- coef1[1] + (data2$x1*coef1[2]) + (data2$x2*coef1[3])
pr2 <- 1/(1+(1/exp(tmp2)))
pr2
这显然不是一个通用的解决方案,也不能正确处理不确定性,但我认为这是比 hacking 更好的方法predict
。