
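I want to plot the results of a multivariable logistic regression (GLM) to show the adjusted relationship between a specific independent variable and a binary outcome, i.e. its relationship independent of the confounders included in the model.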

I have seen posts recommending the following approach, which uses predict followed by curve. Here is an example:

x     <- data.frame(binary.outcome, cont.exposure)
model <- glm(binary.outcome ~ cont.exposure, family=binomial, data=x)
plot(cont.exposure, binary.outcome, xlab="Temperature",ylab="Probability of Response") 
curve(predict(model, data.frame(cont.exposure=x), type="resp"), add=TRUE, col="red")

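However, this does not seem to work for a multivariable regression model. When I add "age" (arbitrary; it could be any variable of the same length) as a confounder, I get the following error: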

> x     <- data.frame(binary.outcome, cont.exposure, age)
> model <- glm(binary.outcome ~ cont.exposure + age, family=binomial, data=x)
> plot(cont.exposure, binary.outcome, xlab="Temperature",ylab="Probability of Response") 
> curve(predict(model, data.frame(cont.exposure=x), type="resp"), add=TRUE, col="red")
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
  variable lengths differ (found for 'age')
In addition: Warning message:
  'newdata' had 101 rows but variable(s) found have 698 rows 

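The model above is a simplified version of the one I want to run, but the principle is the same: I want to plot the relationship between the binary outcome variable and the continuous exposure, independent of the confounders.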

A fix for the approach above, or an alternative way to view the relationship I am interested in, would be great. Many thanks.


1 Answer

set.seed(12345)
# Simulate data: 30 temperatures crossed with 10 ages (300 rows)
dataset <- expand.grid(Temp = rnorm(30), Age = runif(10))
# True response probability from a known logistic model
dataset$Truth <- with(dataset, plogis(2 * Temp - 3 * Age))
dataset$Sample <- rbinom(nrow(dataset), size = 1, prob = dataset$Truth)
model <- glm(Sample ~ Temp + Age, data = dataset, family = binomial)
# Prediction grid: newdata must supply every predictor in the model
newdata <- expand.grid(
  Temp = pretty(dataset$Temp, 20),
  Age = pretty(dataset$Age, 5))
newdata$Sample <- predict(model, newdata = newdata, type = "response")
library(ggplot2)
# One panel per Age value
ggplot(newdata, aes(x = Temp, y = Sample)) + geom_line() + facet_wrap(~Age)


ggplot(newdata, aes(x = Temp, y = Sample, colour = Age, group = Age)) + 
  geom_line()
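
The error in the question appears to come from the newdata passed to predict: it contains only cont.exposure, so predict looks up age elsewhere and finds the original 698-row vector, which does not match the 101 points that curve evaluates. If a single base-graphics curve is preferred over the ggplot2 panels, one option is to hold the confounder at a fixed value. The following is a minimal sketch under stated assumptions, not the only way to do it: "adjusted" is taken to mean Age held at its sample mean, the data are simulated as in the code above, and adjusted_prob is a hypothetical helper name.

```r
# Simulated data in the same shape as the example above (values are illustrative).
set.seed(12345)
dataset <- expand.grid(Temp = rnorm(30), Age = runif(10))
dataset$Sample <- rbinom(nrow(dataset), size = 1,
                         prob = plogis(2 * dataset$Temp - 3 * dataset$Age))
model <- glm(Sample ~ Temp + Age, data = dataset, family = binomial)

# Assumption: "adjusted for Age" means Age is held at its sample mean.
age_fixed <- mean(dataset$Age)

# Hypothetical helper: predicted probability for a vector of temperatures.
# newdata supplies every predictor in the model, so predict() never falls
# back to the full-length Age column from the fitting data.
adjusted_prob <- function(temp) {
  predict(model,
          newdata = data.frame(Temp = temp, Age = age_fixed),
          type = "response")
}

# Raw 0/1 outcomes, with the adjusted fitted curve drawn on top.
plot(dataset$Temp, dataset$Sample,
     xlab = "Temperature", ylab = "Probability of Response")
curve(adjusted_prob(x), add = TRUE, col = "red")
```

Inside curve, x is the grid of evaluation points that curve generates, not a data frame of that name. A different reference value for Age (the median, say, or a clinically meaningful age) shifts the curve, so the chosen value should be reported alongside the plot.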


answered 2012-07-02T11:47:16.220