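I have run into this error many times when trying to fit a gbm or rpart model. I was finally able to reproduce it consistently with publicly available data. I noticed that the error occurs when I use CV (or repeated CV). When I don't use any fit control, I don't get the error. Can someone explain why I keep getting this error?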

fitControl= trainControl("repeatedcv", repeats=5)
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
ds$sub = as.factor(ds$substance)
rpartFit1 <- train(homeless ~ female + i1 + sub + sexrisk + mcs + pcs, 
                   tcControl=fitControl, 
                   method = "rpart", 
                   data=ds)
1 Answer

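There is a typo: the argument should be trControl, not tcControl. When the argument is supplied as tcControl, caret does not recognize it as one of train's own arguments and passes it on to rpart, which raises an error because no such option exists there.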

I think this answers your question about why you get this error when you try to use cross-validation in training.
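To illustrate the mechanism, here is a minimal base-R sketch. The functions outer_train and inner_fit and the list of allowed option names are hypothetical stand-ins, not caret's or rpart's actual code. The sketch only assumes that the wrapper forwards unmatched arguments through `...` and that the fitter rejects names it does not know.

```r
# Hypothetical stand-in for the underlying fitter: it accepts extra
# arguments through '...' but rejects any name it does not recognise.
inner_fit <- function(formula, data, ...) {
  extra <- list(...)
  allowed <- c("minsplit", "cp", "maxdepth")  # illustrative subset only
  unknown <- setdiff(names(extra), allowed)
  if (length(unknown) > 0) {
    stop(sprintf("Argument %s not matched", paste(unknown, collapse = ", ")))
  }
  "fit ok"
}

# Hypothetical stand-in for the wrapper: 'trControl' is consumed here,
# everything else is forwarded to the fitter through '...'.
outer_train <- function(formula, data, trControl = NULL, ...) {
  inner_fit(formula, data, ...)
}

toy <- data.frame(x = 1, y = 1)

# Correct spelling: the wrapper keeps the argument, the fitter never sees it.
ok <- outer_train(y ~ x, toy, trControl = list(method = "cv"))

# Misspelled: 'tcControl' does not match 'trControl', so it lands in '...'
# and the fitter rejects it.
bad <- tryCatch(
  outer_train(y ~ x, toy, tcControl = list(method = "cv")),
  error = function(e) conditionMessage(e)
)

print(ok)
print(bad)
```

With the misspelled name the error comes from the inner function, not the wrapper, which is why the message can look unrelated to cross-validation.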

Here is how it should work:

library(caret)
library(mosaicData)

data(HELPrct)
ds = HELPrct
fitControl= trainControl(method="repeatedcv",repeats=5)
ds$sub = as.factor(ds$substance)

rpartFit1 <- train(homeless ~ female + i1 + sub + sexrisk + mcs + pcs, 
                   trControl=fitControl, 
                   method = "rpart", 
                   data=ds[complete.cases(ds),])

rpartFit1
CART 

117 samples
  6 predictor
  2 classes: 'homeless', 'housed' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 105, 105, 105, 106, 105, 106, ... 
Resampling results across tuning parameters:

  cp          Accuracy   Kappa      
  0.00000000  0.5280303  -0.03503032
  0.01190476  0.5280303  -0.03503032
  0.07142857  0.5977273  -0.02970604

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was cp = 0.07142857.
answered 2020-06-25T20:25:19.073