I can't export a data frame to %dopar% in the foreach package. It works if I use %do% with registerDoSEQ(), but with registerDoParallel() I always get:

Error in { : task 1 failed - "object 'kyphosis' not found"

Here is a reproducible example using the kyphosis data from the rpart package. I'm trying to parallelize stepwise regression a bit:

library(doParallel)
library(foreach)
library(rpart)

invars <- c('Age', 'Number', 'Start')
n_vars <- 2
vars <- length(invars)
iter <- trunc(vars/n_vars)
threads <- 4
if (vars%%n_vars == 0) iter <- iter - 1
iter <- 0:iter

cl <- makeCluster(threads)
registerDoParallel(cl)
#registerDoSEQ()

terms <- ''
min_formula <- paste0('Kyphosis~ 1', terms)
fit <- glm(formula = as.formula(min_formula), data = kyphosis, family = 'binomial')

out <- foreach(x = iter, .export = 'kyphosis') %dopar%  {

  nv <- invars[(x * n_vars + 1):(min(x * n_vars + n_vars, vars))]
  sfit <- step(object = fit, trace =FALSE, scope = list(
    lower = min_formula,
    upper = as.formula(paste(min_formula, '+', paste0(nv, collapse = '+')))),
    steps = 1, direction = 'forward')
  aic <- sfit$aic

  names(aic) <- if(nrow(sfit$anova) == 2) sfit$anova$Step[2]
  aic
}
out
stopCluster(cl)
1 Answer

Add this inside the foreach loop, before the call to step:

.GlobalEnv$kyphosis <- kyphosis
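
For context, here is a minimal sketch of where that line would sit in a parallel loop. It is a simplified variant of the question's code, not the original: it offers one candidate variable per task instead of batches, the loop variable v and the worker count of 2 are assumptions made for illustration, and the assignment is the workaround suggested in this answer rather than an official foreach mechanism.

```r
library(doParallel)  # parallel backend for foreach
library(foreach)
library(rpart)       # provides the kyphosis data frame

invars <- c('Age', 'Number', 'Start')
min_formula <- 'Kyphosis ~ 1'
fit <- glm(formula = as.formula(min_formula), data = kyphosis, family = 'binomial')

cl <- makeCluster(2)  # assumed worker count, adjust as needed
registerDoParallel(cl)

out <- foreach(v = invars, .export = 'kyphosis') %dopar% {
  # Copy the exported data frame into this worker's global environment,
  # so that code evaluated against that environment can find it.
  .GlobalEnv$kyphosis <- kyphosis

  # One forward step, offering a single candidate variable v.
  sfit <- step(object = fit, trace = FALSE, scope = list(
    lower = min_formula,
    upper = as.formula(paste(min_formula, '+', v))),
    steps = 1, direction = 'forward')
  sfit$aic
}

stopCluster(cl)
out
```

The .export argument still matters here: it is what makes kyphosis visible inside the loop body in the first place, and the assignment then copies it to the worker's global environment.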

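I'm not sure why this happens, but my hunch is that step calls glm internally, using the information stored in fit$call, i.e.

glm(formula = as.formula(min_formula), family = "binomial", data = kyphosis)

with the newly updated formula, while the data = kyphosis part stays unchanged. So glm ends up looking for kyphosis in the global environment.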
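
The stored call can be inspected directly to see what this hunch rests on. The short sketch below only shows that the data argument is kept as an unevaluated symbol rather than a copy of the data frame; it does not by itself prove where step performs the lookup on a worker.

```r
library(rpart)  # provides the kyphosis data frame

fit <- glm(Kyphosis ~ 1, data = kyphosis, family = "binomial")

# The call is stored unevaluated, so `data` is just the name kyphosis.
print(fit$call)

# TRUE if the data argument is a symbol that must be looked up again
# whenever the call is re-evaluated.
is.name(fit$call$data)
```

Because only the name travels with the fitted object, any process that re-evaluates the call needs an object called kyphosis to be reachable from the environment in which the evaluation happens.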

Answered 2017-07-27T22:23:46.807