r - Repeated CV in tuneRanger

Question

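I am using the package "tuneRanger" to tune a random forest (RF) model. It works well and I get good results, but I am not sure whether it is overfitting my model. I would like to use repeated CV for every configuration the package evaluates while tuning, but I can't find a way to do that. I would also like to know how the package validates the result of each try (train/test split, CV, repeated CV?). I have been reading the package manual (https://cran.r-project.org/web/packages/tuneRanger/tuneRanger.pdf) but it says nothing about this.

Thanks for your help.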

score 1 · Accepted Answer

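Out-of-bag (OOB) estimates are used to estimate the error, and I don't think you can switch to CV with this package. Whether CV is better than that is for you to decide. Their README links to a publication, which says in section 3.5:

Out-of-bag predictions are used for evaluation, which makes it much faster than other packages that use evaluation strategies such as cross-validation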
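To make the idea of out-of-bag evaluation concrete, here is a minimal sketch that fits a single forest with ranger directly and reads its OOB error. It only illustrates what an OOB estimate is; it is not a reproduction of tuneRanger's internals, and the choice of iris, the seed, and the hyperparameter values are assumptions made for the example.

```r
library(ranger)

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch

# Fit one forest; each tree is grown on a bootstrap sample, so every
# observation is "out of bag" for the trees that did not sample it.
fit <- ranger(Species ~ ., data = iris,
              num.trees = 500, mtry = 2, min.node.size = 1)

# For classification, prediction.error is the OOB misclassification rate:
# each row is predicted only by trees that never saw it during training.
oob_error <- fit$prediction.error
oob_accuracy <- 1 - oob_error
print(oob_accuracy)
```

Because the estimate comes for free from a single fit, no extra resampling loop is needed, which is the speed advantage the quoted passage refers to.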

If you want to use cross-validation or repeated cross-validation, you will have to use caret, for example:

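library(caret)

# 10-fold CV repeated twice; tune mtry and min.node.size, splitrule fixed at "gini"
mdl = train(Species ~ ., data = iris, method = "ranger",
            trControl = trainControl(method = "repeatedcv", repeats = 2),
            tuneGrid = expand.grid(mtry = 2:3, min.node.size = 1:2, splitrule = "gini"))
mdl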

Random Forest 

150 samples
  4 predictor
  3 classes: 'setosa', 'versicolor', 'virginica' 

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 2 times) 
Summary of sample sizes: 135, 135, 135, 135, 135, 135, ... 
Resampling results across tuning parameters:

  mtry  min.node.size  Accuracy  Kappa
  2     1              0.96      0.94 
  2     2              0.96      0.94 
  3     1              0.96      0.94 
  3     2              0.96      0.94 

Tuning parameter 'splitrule' was held constant at a value of gini
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were mtry = 2, splitrule = gini
 and min.node.size = 1.

The parameters you can tune will differ. I think mlr also lets you perform cross-validation, but the same limitation applies.
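For the mlr route, a minimal sketch of repeated cross-validation could look like the following. It assumes the mlr and ranger packages are installed, and the fixed values mtry = 2 and min.node.size = 1 are just the ones selected in the caret run above, used here for illustration. It evaluates one configuration with repeated CV rather than tuning over a grid.

```r
library(mlr)

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch

# Classification task on iris with Species as the target
task <- makeClassifTask(data = iris, target = "Species")

# 10-fold cross-validation, repeated 2 times
rdesc <- makeResampleDesc("RepCV", folds = 10, reps = 2)

# ranger learner with one fixed hyperparameter configuration
lrn <- makeLearner("classif.ranger", mtry = 2, min.node.size = 1)

# Resample and report mean accuracy over all 20 held-out folds
res <- resample(lrn, task, rdesc, measures = acc, show.info = FALSE)
print(res$aggr)
```

One way to use this is as an independent check on hyperparameters found elsewhere, for example by tuneRanger: if the repeated-CV accuracy is close to the OOB-based estimate, that is some reassurance against overfitting to the OOB measure.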
