I have implemented two ensemble techniques, bagging and AdaBoost, in R. They should work with any learner.
My grid:
grids <- list(
  "knn" = expand.grid(k = c(3, 5, 7, 9, 11, 13, 15))
)
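My variables:
library(caret)    # train(), createResample()
library(foreach)  # foreach(), %do%
n <- c(2, 4)  # ensemble sizes to try
boots <- createResample(trainData$BAD, times = 50, list = TRUE)  # 50 bootstrap index sets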
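My bagging:
resu_bag <- c()  # collects the averaged predictions across all settings
for (i in seq_along(grids)) {
  method <- names(grids[i])
  for (j in seq_len(nrow(grids[[i]]))) {
    # single-row tuning grid for this parameter setting
    grid <- data.frame(grids[[i]][j, ])
    colnames(grid) <- names(grids[[i]])
    # start bagging
    bagging <- foreach(k = seq_along(n)) %do% {
      predictions <- foreach(m = seq_len(n[k]), .combine = cbind) %do% {
        # ctrl is a trainControl object defined elsewhere (not shown)
        tune <- train(BAD ~ ., data = trainData, method = method,
                      trControl = ctrl, tuneGrid = grid, metric = "ROC")
        # predicted probabilities for the training rows, then the test rows
        pred <- c(predict(tune, newdata = trainData, type = "prob")$BAD,
                  predict(tune, newdata = testData, type = "prob")$BAD)
      }
      pred_means <- rowMeans(predictions)  # average over the n[k] models
    }
    resu_bag <- c(resu_bag, unlist(bagging))
  }
}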
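For comparison with the loop above, here is a minimal base-R sketch of bagging as it is usually described: each base learner is fit on its own bootstrap resample of the training rows, and the predicted probabilities are averaged. The simulated data, the `glm` base learner, and the names `make_data` and `bag_predict` are illustrative assumptions, not part of the setup above.

```r
# Minimal bagging sketch in base R (illustrative assumptions throughout).
set.seed(42)

# Simulated binary-classification data (hypothetical stand-in for trainData/testData).
make_data <- function(n_obs) {
  x1 <- rnorm(n_obs)
  x2 <- rnorm(n_obs)
  y  <- rbinom(n_obs, 1, plogis(1.5 * x1 - x2))
  data.frame(x1 = x1, x2 = x2, y = y)
}
train_df <- make_data(300)
test_df  <- make_data(100)

# Fit n_models base learners, each on its own bootstrap resample of the
# training rows, and return the averaged predicted probabilities for new_df.
bag_predict <- function(train_df, new_df, n_models) {
  probs <- sapply(seq_len(n_models), function(m) {
    idx <- sample(nrow(train_df), replace = TRUE)  # bootstrap resample
    fit <- glm(y ~ x1 + x2, data = train_df[idx, ], family = binomial)
    predict(fit, newdata = new_df, type = "response")
  })
  rowMeans(probs)  # one column per model; average across models
}

bagged <- bag_predict(train_df, test_df, n_models = 25)
```

Bagging is generally reported to help most with unstable learners such as unpruned trees. A stable learner like the logistic regression used in this sketch tends to gain little from it.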
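My boosting:
resu_boo <- c()  # collects the averaged predictions across all settings
for (i in seq_along(grids)) {
  method <- names(grids[i])
  for (j in seq_len(nrow(grids[[i]]))) {
    # single-row tuning grid for this parameter setting
    grid <- data.frame(grids[[i]][j, ])
    colnames(grid) <- names(grids[[i]])
    # start boosting
    boosting <- foreach(k = seq_along(n)) %do% {
      predictions <- foreach(m = seq_len(n[k]), .combine = cbind) %do% {
        # fit model m on the m-th bootstrap resample of trainData
        train_boo <- trainData[boots[[m]], ]
        # ctrl is a trainControl object defined elsewhere (not shown)
        tune <- train(BAD ~ ., data = train_boo, method = method,
                      trControl = ctrl, tuneGrid = grid, metric = "ROC")
        # predicted probabilities for the training rows, then the test rows
        pred <- c(predict(tune, newdata = trainData, type = "prob")$BAD,
                  predict(tune, newdata = testData, type = "prob")$BAD)
      }
      pred_means <- rowMeans(predictions)  # average over the n[k] models
    }
    resu_boo <- c(resu_boo, unlist(boosting))
  }
}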
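For contrast with the loop above, here is a minimal base-R sketch of discrete AdaBoost as it is usually described: observation weights are updated after every round so that later learners concentrate on rows that were misclassified earlier, and the final prediction is a weighted vote. The toy data, the decision-stump learner, and all function names are illustrative assumptions, not part of the setup above.

```r
# Minimal discrete AdaBoost sketch with decision stumps, base R only
# (illustrative assumptions throughout).
set.seed(1)

# Hypothetical toy data with labels in {-1, +1}.
n_obs <- 200
x <- matrix(rnorm(n_obs * 2), ncol = 2)
y <- ifelse(x[, 1] + x[, 2] > 0, 1, -1)

# Search all features, thresholds and orientations for the stump with the
# smallest weighted misclassification error.
fit_stump <- function(x, y, w) {
  best <- list(err = Inf)
  for (j in seq_len(ncol(x))) {
    for (thr in unique(x[, j])) {
      for (sgn in c(1, -1)) {
        pred <- ifelse(x[, j] > thr, sgn, -sgn)
        err  <- sum(w[pred != y])
        if (err < best$err) best <- list(err = err, j = j, thr = thr, sgn = sgn)
      }
    }
  }
  best
}

predict_stump <- function(s, x) ifelse(x[, s$j] > s$thr, s$sgn, -s$sgn)

adaboost <- function(x, y, rounds) {
  w <- rep(1 / nrow(x), nrow(x))               # start with uniform weights
  stumps <- vector("list", rounds)
  alphas <- numeric(rounds)
  for (b in seq_len(rounds)) {
    s    <- fit_stump(x, y, w)
    err  <- min(max(s$err, 1e-10), 1 - 1e-10)  # guard against log(0)
    a    <- 0.5 * log((1 - err) / err)         # vote weight of this stump
    pred <- predict_stump(s, x)
    w    <- w * exp(-a * y * pred)             # upweight misclassified rows
    w    <- w / sum(w)                         # renormalise to sum to 1
    stumps[[b]] <- s
    alphas[b]   <- a
  }
  list(stumps = stumps, alphas = alphas)
}

# Weighted vote of all stumps; the sign of the score is the predicted label.
predict_adaboost <- function(model, x) {
  score <- rep(0, nrow(x))
  for (b in seq_along(model$stumps)) {
    score <- score + model$alphas[b] * predict_stump(model$stumps[[b]], x)
  }
  sign(score)
}

model     <- adaboost(x, y, rounds = 20)
train_acc <- mean(predict_adaboost(model, x) == y)
```

The sketch passes weights directly to the learner. When a learner cannot accept observation weights, a common alternative is to draw each round's training sample with probabilities proportional to the current weights.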
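My questions:
- Could you advise on whether the implementation is correct?
- The ensembles perform no better than a single learner, and sometimes worse. Why does this happen? What am I doing wrong?
Thank you very much!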