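I'm getting started with H2O and am trying to ensemble a random forest and a multiple linear regression in R. The H2O frame I'm using looks like this: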
summary(training_frame)
       HS               AS               HST               AST               HF               AF
 Min.   : 3.00    Min.   : 2.00    Min.   : 0.000    Min.   : 0.000    Min.   : 3.00    Min.   : 1.00
 1st Qu.:11.00    1st Qu.: 8.00    1st Qu.: 3.000    1st Qu.: 2.000    1st Qu.:11.00    1st Qu.:11.00
 Median :14.00    Median :11.00    Median : 5.000    Median : 4.000    Median :14.00    Median :14.00
 Mean   :14.44    Mean   :11.53    Mean   : 5.211    Mean   : 4.063    Mean   :14.39    Mean   :14.03
 3rd Qu.:18.00    3rd Qu.:15.00    3rd Qu.: 7.000    3rd Qu.: 5.000    3rd Qu.:17.00    3rd Qu.:17.00
 Max.   :36.00    Max.   :28.00    Max.   :18.000    Max.   :13.000    Max.   :30.00    Max.   :27.00
       HC                AC                HY               AY               HR                AR
 Min.   : 0.000    Min.   : 0.000    Min.   :0.000    Min.   :0.000    Min.   :0.0000    Min.   :0.0000
 1st Qu.: 4.000    1st Qu.: 3.000    1st Qu.:1.000    1st Qu.:2.000    1st Qu.:0.0000    1st Qu.:0.0000
 Median : 6.000    Median : 5.000    Median :2.000    Median :3.000    Median :0.0000    Median :0.0000
 Mean   : 6.421    Mean   : 4.824    Mean   :2.563    Mean   :2.858    Mean   :0.1632    Mean   :0.2079
 3rd Qu.: 8.000    3rd Qu.: 7.000    3rd Qu.:3.000    3rd Qu.:4.000    3rd Qu.:0.0000    3rd Qu.:0.0000
 Max.   :17.000    Max.   :13.000    Max.   :8.000    Max.   :7.000    Max.   :2.0000    Max.   :3.0000
      dif
 Min.   :-5.0000
 1st Qu.:-1.0000
 Median : 0.0000
 Mean   : 0.5026
 3rd Qu.: 2.0000
 Max.   : 6.0000
I then tried to set up two base models and a super learner to predict the variable "dif", using the following code:
predictores <- names(X[, -13])
regre.1 <- function(.., family = "gaussian", lambda = 0) h2o.glm.wrapper(.., family = family, lambda = lambda)
randomforest.1 <- function(..., mtries = 5, ntree = 500) h2o.randomForest.wrapper(..., mtries = mtries, ntree = ntree)
h2o.glm.1 <- function(..., family = "gaussian", lambda = 0) h2o.glm.wrapper(..., family = family, lambda = lambda)
learner <- c("regre.1", "randomforest.1")
metalearner <- "h2o.glm.1"
fit <- h2o.ensemble(x = predictores, y = "dif",
                    training_frame = training_frame,
                    learner = learner,
                    metalearner = metalearner,
                    cvControl = list(V = 5))
However, I get this error message:
|============================================================================================| 100%
[1] "Cross-validating and training base learner 1: regre.1"
Error in match.fun(learner[l])(y = y, x = x, training_frame = training_frame, :
unused arguments (y = y, x = x, training_frame = training_frame, validation_frame = NULL, fold_column = fold_column, keep_cross_validation_folds = TRUE)
Timing stopped at: 0 0 0
What is wrong with my code?
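One detail that may be worth checking: the error is raised for regre.1, whose signature is written with `..` (two dots), whereas randomforest.1 and h2o.glm.1 use `...` (three dots). In R, `..` is just an ordinary argument name, so it does not forward extra arguments. The sketch below reproduces that behaviour in base R only; `inner`, `two_dots` and `three_dots` are hypothetical stand-ins and not part of H2O or h2oEnsemble, so this illustrates the likely mechanism rather than confirming the cause.

```r
# Hypothetical stand-in for a wrapper such as h2o.glm.wrapper (NOT the real function).
inner <- function(x, y, family = "gaussian") paste(x, y, family)

# `..` is an ordinary argument name, so named arguments like x and y have nowhere to go.
two_dots <- function(.., family = "gaussian") inner(.., family = family)

# `...` collects any additional arguments and forwards them to the inner call.
three_dots <- function(..., family = "gaussian") inner(..., family = family)

# Calling the two-dot version with named arguments fails before inner() is reached.
res_bad <- tryCatch(
  two_dots(y = "dif", x = "HS"),
  error = function(e) paste("error:", conditionMessage(e))
)

# The three-dot version passes x and y through as intended.
res_ok <- three_dots(y = "dif", x = "HS")

print(res_bad)
print(res_ok)
```

If this is what is happening here, writing regre.1 with `...` in both its signature and the call to h2o.glm.wrapper would be the first thing to try.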