我正在尝试从bayesplot
一个对象的包中实现函数,INLA
并且有点不确定如何从后验预测分布中提取。我想我几乎拥有它,但rstan
平局比平局更具可INLA
变性。
在rstan
中,使用bayesplot
小插图中的简化示例,我可以:
library(bayesplot)
library(ggplot2)
library(rstanarm)
library(ggpubr)
library(tidyverse)
#rstan model set up
roaches$roach100 <- roaches$roach1 / 100 # pre-treatment number of roaches (in 100s)
fit_poisson <- stan_glm(y ~ roach100 + treatment + senior, offset = log(exposure2), family = poisson(link = "log"), data = roaches, seed = 1111, refresh = 0)
#In order to use the PPC functions from the bayesplot package we need a vector y of outcome values:
y <- roaches$y
#and a matrix yrep of draws from the posterior predictive distribution,
yrep_poisson <- posterior_predict(fit_poisson, draws = 500)
#then plot:
p1 <- bayesplot::ppc_dens_overlay(y, yrep_poisson[1:50, ])
p1
我想在一个INLA
对象上复制该图。根据bayesplot
小插图,您可以这样做,因为他们提供了代码来定义一个简单的pp_check
方法来创建类的拟合模型对象,例如foo
:
pp_check.foo <- function(object, type = c("multiple", "overlaid"), ...) {
type <- match.arg(type)
y <- object[["y"]]
yrep <- object[["yrep"]]
stopifnot(nrow(yrep) >= 50)
samp <- sample(nrow(yrep), size = ifelse(type == "overlaid", 50, 5))
yrep <- yrep[samp, ]
if (type == "overlaid") {
ppc_dens_overlay(y, yrep, ...)
} else {
ppc_hist(y, yrep, ...)
}
}
要使用pp_check.foo
,我们可以用y
和yrep
组件创建一个列表,并给它类 foo:
x <- list(y = rnorm(200), yrep = matrix(rnorm(1e5), nrow = 500, ncol = 200))
class(x) <- "foo"
#create plot above:
pp_check(x, type = "overlaid")
英拉
#create same model but in inla:
library(INLA)
fit_poisson_inla <- inla(y ~ roach100 + treatment + senior, offset = log(exposure2), data = roaches,
control.predictor = list(compute = T),
family = "poisson")
inla_object_name$marginals.fitted.values
返回每个 的后验预测分布y
:
fit_poisson_inla$marginals.fitted.values
#so to get distribution for first oberservation:
fitted.Predictor.1 <- fit_poisson_inla$marginals.fitted.values[[1]]
我认为从这个重复采样会给我我需要的东西,但只有 75 个值(dim(fitted.Predictor.1)
每个观察值用于创建这个分布,而实际上我想从全范围的值中采样。我认为我们可以做到这一点(部分4.3这里)通过inla.tmarginal
使用线性预测器:
fitted_dist <- fit_poisson_inla$marginals.linear.predictor
#should i have used "inla.rmarginal(n, marginal)"?
marginal_dist <- lapply(fitted_dist, function(y) inla.tmarginal(function(x) {exp(x)}, y)) %>% map(~ as.data.frame(.) %>% rename(., xx = x))
#resample 500 times
yrep_poisson_inla <- as.matrix(bind_rows(rerun(500, lapply(marginal_dist, function(x) sample(x$xx, 1)) %>% as.data.frame())))
#convert to class foo for pp_check
x <- list(y = y, yrep = yrep_poisson_inla[1:50, ])
class(x) <- "foo"
p2 <- pp_check(x, type = "overlaid")
#plot
ggarrange(p1, p2, ncol = 1, nrow = 2, labels = c("rstan", "inla sample"))
我的问题是我如何正确地从这个inla
( fit_poisson_inla
) 对象的后验预测分布中获取一个绘制矩阵以传递到pp_check
? yrep_poisson
产生离散值,同时yrep_poisson_inla
产生连续值。rstan
平局的变化比INLA
(第二个情节)要多得多。我所做的是否正确,这只是一些抽样问题,还是不同方法的产物?在更复杂的示例中,差异可能很大。
谢谢