
I have been trying to run a parallelized foreach loop in R. It runs fine for about ten iterations, but then crashes with the error:

Error in { : task 7 failed - "missing value where TRUE/FALSE needed"
Calls: %dopar% -> <Anonymous>
Execution halted

I append the result of each iteration to a file, and that file does show the expected output. My script is below; it uses the combn_sub function from this post:

library(data.table)  # provides fread()
LBRA <- fread(
 input      = "LBRA.012",
 data.table = FALSE)
str_bra <- nrow(LBRA)

br1sums <- colSums(LBRA)
b1non <- which(br1sums == 0)

LBRA_trim <- LBRA[,-b1non]

library(foreach)
library(doMC)
registerDoMC(28)

foreach(X = seq(2, (nrow(LBRA)-1))) %dopar% {
  com <- combn_sub(
   x    = nrow(LBRA),
   m    = X,
   nset = 1000)

  out_in <- matrix(
   ncol = 2,
   nrow = 1)
  colnames(out_in) <- c("SNPs", "k")

    for (A in seq(1, ncol(com))){
      rowselect <- com[, A]

      sub <- LBRA_trim[rowselect, ]
      subsum <- colSums(sub)

      length <- length(which(subsum != 0)) - 1
      out_in <- rbind(out_in, c(length, X))
    }

  write.table(
   file   = "plateau.csv",
   sep    = "\t",
   x      = out_in,
   append = TRUE)
}

2 Answers


I had a similar problem with my foreach call:

tmpcol <- foreach(j = idxs:idxe, .combine=cbind) %dopar% { imp(j) }

Error in { : task 18 failed - "missing value where TRUE/FALSE needed"

Changing the .errorhandling argument only makes foreach ignore the error:

tmpcol <- foreach(j = idxs:idxe, .combine=cbind, .errorhandling="pass") %dopar% { imp(j) }

Warning message:
In fun(accum, result.18) :
  number of rows of result is not a multiple of vector length (arg 2)

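I suggest running the function from your foreach call on its own for X=7. In my case the problem was that my function imp(j) was throwing an error (for j=18 it was tripping over a calculation involving NA), and foreach reported that error only as the vague message above.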
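A minimal sketch of that debugging approach, not the original code: the function `imp()` and the vector `values` below are hypothetical stand-ins. It runs the loop sequentially with `%do%` and `.errorhandling = "pass"` but without `.combine`, so each failed task comes back as an error object that can be located and inspected. As far as I know the task number in the message counts iterations from 1, so with `X = seq(2, ...)` task 7 would correspond to X = 8 rather than X = 7; checking both is cheap.

```r
library(foreach)

# Hypothetical stand-in for imp(): the if() condition becomes NA for one input.
values <- c(4, 9, NA, 16)
imp <- function(j) {
  if (values[j] > 5) sqrt(values[j]) else values[j]
}

# Without .combine, foreach returns a list, so error objects are kept as-is
# instead of being fed into cbind/rbind.
res <- foreach(j = seq_along(values), .errorhandling = "pass") %do% imp(j)

# Tasks whose result is an error condition are the ones that failed.
failed <- which(vapply(res, inherits, logical(1), what = "error"))
print(failed)
print(conditionMessage(res[[failed[1]]]))

# Re-running the failing index outside foreach gives a normal traceback:
# imp(failed[1])
```

Once the failing index is known, calling the function directly for that index (or stepping through it with `debug()`) usually shows which comparison received an NA.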

Answered 2016-10-24T15:52:57.903

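As @Roland pointed out, writing to a file inside a foreach loop is a very bad idea. Even when writing in append mode, the individual cores will try to write to the file at the same time and may clobber each other's output. Instead, capture the results of the foreach statement with the .combine="rbind" option and write the file once, after the loop: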

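library(foreach)
library(doParallel)  # registerDoParallel() accepts a cluster object; doMC's registerDoMC() expects a core count

# Worker output (the cat() progress lines below) is redirected to this log file.
cluster <- makeCluster(28, outfile = "MulticoreLogging.txt")
registerDoParallel(cluster)

foreach_outcome_table <- foreach(X = seq(2, nrow(LBRA) - 1), .combine = "rbind") %dopar% {

  cat(paste(Sys.info()[["nodename"]], Sys.getpid(), sep = "-"), "now performing loop", X, "\n")

  com <- combn_sub(x = nrow(LBRA), m = X, nset = 1000)

  # Start with zero rows so that no all-NA row ends up in the output.
  out_in <- matrix(ncol = 2, nrow = 0)
  colnames(out_in) <- c("SNPs", "k")

  for (A in seq_len(ncol(com))) {
    rowselect <- com[, A]

    sub <- LBRA_trim[rowselect, ]
    subsum <- colSums(sub)

    # Named n_snps rather than length, to avoid shadowing base::length().
    n_snps <- length(which(subsum != 0)) - 1
    out_in <- rbind(out_in, c(n_snps, X))
  }
  out_in  # returned to foreach and combined with rbind
}
stopCluster(cluster)

# Written once, from the parent process only.
write.table(file = "plateau.csv", sep = "\t", x = foreach_outcome_table, append = TRUE)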

In addition, you could replace the inner for loop with a nested foreach loop, which may be more efficient.
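A rough sketch of what such a nested loop could look like with foreach's `%:%` operator. It is not the original script: `geno` is a toy matrix standing in for `LBRA_trim`, and each task draws one random row subset with `sample.int()` instead of taking a column from `combn_sub()`, so the sampling scheme is an assumption. It uses `%do%` so that it runs without a registered backend; with a backend registered, `%dopar%` can be swapped in.

```r
library(foreach)

set.seed(42)
# Toy genotype matrix standing in for LBRA_trim
# (assumption: rows are individuals, columns are SNPs).
geno <- matrix(rbinom(6 * 10, size = 2, prob = 0.2), nrow = 6)
nset <- 4  # subsets drawn per subset size (the question uses 1000)

# %:% merges the two foreach calls into a single stream of (X, A) tasks.
plateau <- foreach(X = 2:(nrow(geno) - 1), .combine = rbind) %:%
  foreach(A = seq_len(nset), .combine = rbind) %do% {
    rows <- sample.int(nrow(geno), X)  # stand-in for one column of combn_sub()
    sub_sums <- colSums(geno[rows, , drop = FALSE])
    # The question subtracts 1 from this count; that is omitted here
    # because the toy matrix has only SNP columns.
    c(SNPs = sum(sub_sums != 0), k = X)
  }

print(dim(plateau))
```

Whether this is actually faster depends on how expensive each task is: with many very small tasks, scheduling overhead can outweigh the gain, so timing both variants on the real data is worthwhile.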

Answered 2016-02-08T16:40:00.720