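I am running a simulation-based power analysis in R. I run R through RStudio (0.98.932) and use plyr::rdply and lme4::glmer to generate the data and fit the models, respectively (see the end of the reproducible example below for the R environment and package versions).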
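The procedure randomly generates a data set for a given parametrization and fits a model to it. Every now and then, however, the model fails to converge. When the following warnings occur
[1] "unable to evaluate scaled gradient"
[2] "Model failed to converge: degenerate Hessian with 1 negative eigenvalues"
R enters Browse mode and I have to intervene manually (e.g. by pressing c) to get back to the simulation loop. This is a real pain, because I need to run thousands of iterations over several days, yet every time this particular convergence failure occurs the run halts until I press a key.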
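Is there a way to keep R from entering Browse mode? I store all warnings raised in each simulation, so my only problem is the manual intervention required whenever this particular convergence failure occurs. I tried the purrr::quietly and purrr::safely functions, but without success (see the example in the code below).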
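Here is an MWE that runs on my computer (I use set.seed for reproducibility, so I hope it produces the same results regardless of package versions etc.). The example applies the same logic as my actual simulations, but with a different and simpler parametrization: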
library(lme4)
library(plyr)
library(purrr)
# function to generate data that will lead to convergence failure
mini_simulator <- function() {
  nb_items <- 10 # observations per subject
  nb_subj <- 10 # subjects per group
  generate_data <- function() {
    A <- rbinom(nb_items * nb_subj, 1, .99)
    B <- rbinom(nb_items * nb_subj, 1, .8)
    simdata <- data.frame(
      Group = rep(c("A", "B"), each = nb_items * nb_subj),
      Subj = rep(1 : (nb_subj * 2), each = nb_items),
      Items = 1:nb_items,
      Response = c(A, B)
    )
  }
}
# Sanity check that the function is generating data appropriately.
# d should be a dataframe with 200 obs. of 4 variables
d <- mini_simulator()()
head(d, 3)
#   Group Subj Items Response
# 1     A    1     1        1
# 2     A    1     2        1
# 3     A    1     3        1
rm(d)
## Functions to fit model
# basic function to fit model on simulated data
fit_model <- function(data_sim) {
  fm <- glmer(
    formula = Response ~ Group + (1|Subj) + (1|Items),
    data = data_sim, family = "binomial")
  out <- data.frame(summary(fm)$coef)
  out
}
# similar but using purrr::quietly (also tried purrr::safely with no success)
# see http://r4ds.had.co.nz/lists.html section "Dealing with failure"
fit_model_quietly <- function(data_sim) {
  purrr_out <- purrr::quietly(glmer)(
    formula = Response ~ Group + (1|Subj) + (1|Items),
    data = data_sim, family = "binomial")
  fm <- purrr_out$result
  out <- data.frame(summary(fm)$coef)
  # keeps track of convergence failures and other warnings
  out$Warnings <- paste(unlist(purrr_out$warnings), collapse = "; ")
  out
}
# this seed creates the problematic convergence failure on the first evaluation
# of rdply
set.seed(2)
# When I run the next line R goes into Browse mode and I need to enter "c"
# in order to continue
simulations <- plyr::rdply(.n = 3, fit_model(mini_simulator()()))
simulations
# problem persists using the quietly adverb from purrr
set.seed(2)
simulations <- plyr::rdply(.n = 3, fit_model_quietly(mini_simulator()()))
simulations
# sessionInfo()
# R version 3.1.2 (2014-10-31)
# Platform: i386-w64-mingw32/i386 (32-bit)
#
# locale:
# [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 LC_MONETARY=Swedish_Sweden.1252
# [4] LC_NUMERIC=C LC_TIME=Swedish_Sweden.1252
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] purrr_0.2.1 plyr_1.8.1 lme4_1.1-8 Matrix_1.1-4
#
# loaded via a namespace (and not attached):
# [1] grid_3.1.2 lattice_0.20-29 magrittr_1.5 MASS_7.3-35 minqa_1.2.4 nlme_3.1-118 nloptr_1.0.4
# [8] Rcpp_0.11.3 splines_3.1.2 tools_3.1.2
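The warning-capturing idea behind fit_model_quietly can also be sketched in base R alone, which may help when checking what each layer does. This is only a minimal sketch: capture_conditions is a hypothetical helper name, not part of purrr or lme4, and it does not touch options("error"), so on its own it is not expected to change the Browse behaviour described above.

```r
# Hypothetical helper (not from purrr or lme4): evaluates `expr`, collects the
# messages of all warnings it raises, and records an error message instead of
# stopping if the expression fails.
capture_conditions <- function(expr) {
  warnings <- character(0)
  error <- NA_character_
  result <- tryCatch(
    withCallingHandlers(
      expr,
      warning = function(w) {
        # store the message, then resume evaluation without printing it
        warnings <<- c(warnings, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    ),
    error = function(e) {
      error <<- conditionMessage(e)
      NULL # result is NULL when the expression errored
    }
  )
  list(result = result, warnings = warnings, error = error)
}

# Toy usage with artificial warnings (no model fitting involved)
res <- capture_conditions({
  warning("first problem")
  warning("second problem")
  42
})
paste(res$warnings, collapse = "; ")
```

In a simulation, the expression passed in would presumably be the glmer call, and the collapsed warning string could be stored next to the coefficients in the same way as out$Warnings above.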
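Update (based on r2evans's comment)
On both of my computers, options("error") yields
(function () { .rs.breakOnError(TRUE) })()
This appears to be some kind of RStudio default, and it does seem to make R enter Browse mode whenever it hits a stop() call (I see this can also be changed in the GUI via the menu Debug > On Error > ...). In any case, once I set options(error = NULL) the problem goes away. Here is the new (simplified) example, which works fine (both in this minimal example and when applied to the actual simulations):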
library(lme4)
library(plyr)
library(purrr)
options(error=NULL)
## Function to generate data
# Generates data that will lead to convergence failure
mini_simulator <- function() {
  nb_items <- 10 # observations per subject
  nb_subj <- 10 # subjects per group
  generate_data <- function() {
    A <- rbinom(nb_items * nb_subj, 1, .99)
    B <- rbinom(nb_items * nb_subj, 1, .8)
    simdata <- data.frame(
      Group = rep(c("A", "B"), each = nb_items * nb_subj),
      Subj = rep(1 : (nb_subj * 2), each = nb_items),
      Items = 1:nb_items,
      Response = c(A, B)
    )
  }
}
## Function to fit model
# Fits model on simulated data with purrr::quietly to capture warnings
# (http://r4ds.had.co.nz/lists.html section "Dealing with failure")
fit_model_quietly <- function(data_sim) {
  purrr_out <- purrr::quietly(glmer)(
    formula = Response ~ Group + (1|Subj) + (1|Items),
    data = data_sim, family = "binomial")
  fm <- purrr_out$result
  out <- data.frame(summary(fm)$coef)
  # keeps track of convergence failures and other warnings
  out$Warnings <- paste(unlist(purrr_out$warnings), collapse = "; ")
  out
}
# this seed creates the problematic convergence failure on the first evaluation
# of rdply
set.seed(2)
simulations <- plyr::rdply(.n = 3, fit_model_quietly(mini_simulator()()))
simulations
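Because options(error = NULL) changes the setting for the rest of the session, one possible refinement is to clear the handler only while the simulation runs and restore it afterwards. The following is a minimal base R sketch under that assumption; with_error_handler_disabled is a hypothetical helper name, not an RStudio or plyr function.

```r
# Hypothetical helper: evaluates `expr` with the "error" option cleared and
# restores the previous handler afterwards, even if `expr` fails.
with_error_handler_disabled <- function(expr) {
  old <- options(error = NULL)      # options() returns the previous value
  on.exit(options(old), add = TRUE) # put the old handler back on exit
  force(expr)                       # `expr` is lazy, so it is evaluated here
}

# Toy usage: inside the call no error handler is set
with_error_handler_disabled(is.null(getOption("error")))
```

The rdply call above could be passed as `expr`, so that the RStudio handler is back in place for interactive debugging once the simulations have finished.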