r - gbm + plyr + doMC: cannot open the connection in R

Asked 2017-05-21T03:44:59.440

Viewed 255 times

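I have a dataset of US house prices. The data spans 50 different states. I want to build one GBM per state, in parallel. I also want to use the cv.folds argument of the gbm R package: I want to run 3-fold CV to find the optimal value of n.trees.

My code: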

library(gbm)
library(plyr)
library(doMC)
doMC::registerDoMC(cores = detectCores())

gbms = dlply(.data = df, .variables = "State", .fun = function(df_temp) {
    gbm(log(Price) ~ ., 
        data = df_temp[, c(features, outcome)],
        distribution = "gaussian",
        n.trees = 5000,
        shrinkage = 0.001,
        interaction.depth = 3,
        n.minobsinnode = 10,
        bag.fraction = 0.5,
        train.fraction = 0.8,
        cv.folds = 3, # if I turn this to 0, the code runs fine
        keep.data = FALSE
        )
    }, .parallel = TRUE
  )

The code above returns the following error:

Error in do.ply(i) : task 1 failed - "cannot open the connection"

However, if I change cv.folds = 3 to cv.folds = 0, the code runs fine and I get 50 GBMs, but they are not tuned for n.trees.
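For context, this is roughly how the cross-validated n.trees would be read off a single fitted model once cv.folds works. It is only a sketch on synthetic data: the toy data frame and its columns x1 and x2 are hypothetical stand-ins for one state's rows, and the smaller n.trees and larger shrinkage are chosen just to keep the run short.

```r
library(gbm)

# Synthetic stand-in for one state's data (hypothetical columns x1, x2).
set.seed(1)
toy <- data.frame(x1 = runif(300), x2 = runif(300))
toy$Price <- exp(1 + 2 * toy$x1 - toy$x2 + rnorm(300, sd = 0.1))

fit <- gbm(log(Price) ~ x1 + x2,
           data = toy,
           distribution = "gaussian",
           n.trees = 500,
           shrinkage = 0.05,
           interaction.depth = 3,
           n.minobsinnode = 10,
           bag.fraction = 0.5,
           cv.folds = 3,
           n.cores = 1)   # use a single worker for gbm's own CV loop

# Iteration with the lowest cross-validated error.
best_iter <- gbm.perf(fit, method = "cv", plot.it = FALSE)
```

With cv.folds = 0 there is no cross-validation error to minimise, so method = "cv" is not available, which is why the 50 models from the working run are not tuned.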

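Note that if I set .parallel = FALSE the code works, but it takes a long time because it runs on a single core. The problem also appears when I try using foreach.

How can I solve this? Any help would be greatly appreciated.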
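A foreach version of the same loop might look like the sketch below. It is written to help inspect the failure rather than to fix it: .errorhandling = "pass" makes foreach return the error object for a failed task instead of aborting the whole loop. The toy data, the state codes and the columns x1 and x2 are hypothetical. Setting n.cores = 1 is an assumption worth testing, not a confirmed remedy: gbm() has its own n.cores argument for the cross-validation loop, so with cv.folds > 0 each parallel task may start workers of its own, but whether that interaction causes the error is not established here.

```r
library(gbm)
library(foreach)
library(doMC)   # fork-based backend, Unix-alikes only

registerDoMC(cores = 2)

# Synthetic stand-in for the real data: three hypothetical states.
set.seed(1)
toy <- data.frame(State = rep(c("AA", "BB", "CC"), each = 200),
                  x1 = runif(600),
                  x2 = runif(600))
toy$Price <- exp(1 + 2 * toy$x1 - toy$x2 + rnorm(600, sd = 0.1))

pieces <- split(toy, toy$State)

# .errorhandling = "pass" returns the condition object for a failed task
# instead of stopping the loop, so failures can be inspected per state.
fits <- foreach(df_temp = pieces, .errorhandling = "pass") %dopar% {
  gbm(log(Price) ~ x1 + x2,
      data = df_temp,
      distribution = "gaussian",
      n.trees = 200,
      shrinkage = 0.05,
      interaction.depth = 3,
      cv.folds = 3,
      n.cores = 1,        # assumption: one worker for gbm's own CV loop
      keep.data = FALSE)
}
names(fits) <- names(pieces)

# TRUE for states whose task raised an error.
failed <- vapply(fits, inherits, logical(1), what = "error")
```

For any state flagged in failed, conditionMessage(fits[[state]]) gives the underlying message, which is more specific than the "task 1 failed" wrapper.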
