r - R seeding doesn't "set", results don't replicate

Question

I have a script that looks like this:

#This is the master script.  It runs all other scripts.
rm(list=ls()) 

#Run data cleaning script
source("datacleaning.R")

set.seed(413) #Seed pre-selected as lead author's wife's birthday (April 13th)
reps=128

#Make imputed datasets
source("makeimps.R")

#Model selection step 1.  
source("model_selection.1.R")
load("AIC_results.1")
AIC_results

#best model removed the year interaction

#Model selection step 2.  removed year interaction
source("model_selection.2.R")
load("AIC_results.2")
AIC_results

#all interactions pretty good.  keeping this model

#Final selected model:
source("selectedmodel.R")

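I submit this master script to a supercomputing cluster; it takes about 17 hours of CPU time and 40 minutes of walltime on 32 cores. (Hence the lack of a reproducible example.) But when I run the script and look at the results, then run it again and look at the results, they are slightly different. Why? I set the seed! Does the seed get reset somehow? Do I need to specify the seed in each script file?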

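I need to increase the number of reps, because I clearly haven't converged sufficiently. But that is a separate issue. Why don't my results replicate here, and how do I fix it?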

Thanks in advance.

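EDIT: I am parallelizing via doMC and plyr. Some simple googling based on the comments below turned up the fact that you can't really set "parallel seeds" with these packages. I would need to migrate my code to SNOW somehow. If anyone knows of a solution for doMC and plyr, I would be grateful to learn what it is.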

score 2 · Accepted Answer

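Take a look at the doRNG package, which was developed specifically for this kind of reproducible parallel computation. Set the seed in the loop call and you will be able to reproduce your results exactly...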

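library(doParallel)
library(doRNG)

# Start a 4-worker cluster and register it as the foreach backend
cl <- makeCluster(4)
registerDoParallel(cl)

# %dorng% with .options.RNG seeds the random-number streams of the loop
unlist( foreach( i = 1:4 , .options.RNG = 413 ) %dorng% { runif(1) } )
#[1] 0.5251507 0.4326805 0.6409496 0.5523651

# Same seed, same results
unlist( foreach( i = 1:4 , .options.RNG = 413 ) %dorng% { runif(1) } )
#[1] 0.5251507 0.4326805 0.6409496 0.5523651

# Shut the workers down when finished
stopCluster(cl)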
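
As for why the original set.seed(413) had no effect: it only seeds the R session that runs the master script, while each parallel worker is a separate process with its own RNG state. The sketch below illustrates this using only the parallel package that ships with R, as a possible alternative if adding doRNG is not an option. It assumes a local socket cluster created with makeCluster() rather than the doMC/plyr setup from the question, and draw_on_workers is a hypothetical helper written for this illustration.

```r
library(parallel)  # ships with base R

# Hypothetical helper: draw one uniform random number on each worker.
draw_on_workers <- function(seed = NULL, n_workers = 2) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl))
  if (!is.null(seed)) {
    # Give every worker its own L'Ecuyer-CMRG stream derived from `seed`
    clusterSetRNGStream(cl, iseed = seed)
  }
  # clusterCall runs the function once on each worker, in worker order
  unlist(clusterCall(cl, runif, 1))
}

# set.seed() in the master session does not reach the workers,
# so these two runs will generally differ.
set.seed(413); unseeded_a <- draw_on_workers()
set.seed(413); unseeded_b <- draw_on_workers()

# Seeding the worker streams makes repeated runs identical.
seeded_a <- draw_on_workers(seed = 413)
seeded_b <- draw_on_workers(seed = 413)
identical(seeded_a, seeded_b)
```

Note that this only guarantees reproducibility when the same work lands on the same worker each time, as it does with clusterCall here. With load-balanced scheduling the assignment of tasks to workers can change between runs, which is the problem doRNG addresses by giving each loop iteration its own stream.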
