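I am new to survival analysis. I am trying to understand whether it matters to include in the dataset observations at which no event was observed.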
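I have the following made-up data for three patients:
- the first patient is observed to have the disease at time 12,
- the second patient is observed twice, at time 13 and at time 14, and is disease-free both times, and
- the third patient is observed to have the disease at time 1.
I tried the following two examples: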
```r
library(flexsurv)
# Four rows: two events (times 12 and 1) and two censored rows (times 13 and 1)
surv_test <- with(data.frame(status = c(1, 0, 0, 1), time = c(12L, 13L, 1L, 1L)), Surv(time, status))
flexsurvreg(surv_test ~ 1, dist = "weibull")
#Call:
#flexsurvreg(formula = surv_test ~ 1, dist = "weibull")
#Estimates:
#       est    L95%   U95%   se
#shape  0.937  0.295  2.973  0.552
#scale 13.755  3.001 63.035 10.683
#N = 4, Events: 2, Censored: 2
#Total time at risk: 27
#Log-likelihood = -7.199135, df = 2
#AIC = 18.39827
```
and
```r
# Three rows: two events (times 12 and 1) and one censored row (time 14)
surv_test <- with(data.frame(status = c(1, 0, 1), time = c(12L, 14L, 1L)), Surv(time, status))
flexsurvreg(surv_test ~ 1, dist = "weibull")
#Call:
#flexsurvreg(formula = surv_test ~ 1, dist = "weibull")
#Estimates:
#       est    L95%   U95%   se
#shape  0.844  0.244  2.922  0.535
#scale 13.883  2.635 73.140 11.770
#N = 3, Events: 2, Censored: 1
#Total time at risk: 27
#Log-likelihood = -7.167346, df = 2
#AIC = 18.33469
```
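The two fits give clearly different estimates, and I was wondering if someone could explain why including the observations at which the patient is not ill matters. Thanks!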
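
For context, here is a minimal sketch of the right-censored Weibull log-likelihood. It assumes the standard form, in which an event row contributes the log density and a censored row contributes the log survival probability. The helper name `weibull_loglik` is hypothetical, and the shape and scale values are the rounded estimates printed above, not refitted values.

```r
# Hypothetical helper: log-likelihood of right-censored data under a Weibull model,
# using the same shape/scale parametrisation as dweibull().
weibull_loglik <- function(time, status, shape, scale) {
  # Event rows (status == 1) contribute the log density.
  event_part <- dweibull(time, shape = shape, scale = scale, log = TRUE)
  # Censored rows (status == 0) contribute the log survival probability.
  censored_part <- pweibull(time, shape = shape, scale = scale,
                            lower.tail = FALSE, log.p = TRUE)
  sum(ifelse(status == 1, event_part, censored_part))
}

# Four-row data set, evaluated at the rounded estimates of the first fit.
ll_split <- weibull_loglik(c(12, 13, 1, 1), c(1, 0, 0, 1),
                           shape = 0.937, scale = 13.755)
# Three-row data set, evaluated at the rounded estimates of the second fit.
ll_single <- weibull_loglik(c(12, 14, 1), c(1, 0, 1),
                            shape = 0.844, scale = 13.883)
print(c(split = ll_split, single = ll_single))
```

Because the estimates are rounded, both values should only land close to the log-likelihoods reported above. In this form every row is treated as a separate subject, so two censored rows at times 13 and 1 contribute S(13) * S(1), which in general is not the same as a single censored row contributing S(14).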