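How should I interpret this: the sensitivity is very low, whereas the AUC is very high, in the caret train cross-validation resampling results for the data I trained on?
Is the model performing poorly?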
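This usually happens when there is a class imbalance and the default 50% probability cutoff produces poor predictions, even though the class probabilities, while poorly calibrated, do a good job of separating the classes.
Here is an example: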
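library(caret)

# Simulate a two-class data set; the intercept controls the class balance,
# and this non-default value makes the classes imbalanced
set.seed(1)
dat <- twoClassSim(500, intercept = 10)

# Tune a radial-basis SVM over 10 random (sigma, C) candidates, selecting on ROC AUC
set.seed(2)
mod <- train(Class ~ ., data = dat, method = "svmRadial",
             tuneLength = 10,
             preProc = c("center", "scale"),
             metric = "ROC",
             trControl = trainControl(search = "random",
                                      classProbs = TRUE,
                                      summaryFunction = twoClassSummary))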
The results are:
> mod
Support Vector Machines with Radial Basis Function Kernel
500 samples
15 predictor
2 classes: 'Class1', 'Class2'
Pre-processing: centered (15), scaled (15)
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 500, 500, 500, 500, 500, 500, ...
Resampling results across tuning parameters:
sigma C ROC Sens Spec
0.01124608 21.27349102 0.9615725 0.33389177 0.9910125
0.01330079 419.19384543 0.9579240 0.34620779 0.9914320
0.01942163 85.16782989 0.9535367 0.33211255 0.9920583
0.02168484 632.31603140 0.9516538 0.33065224 0.9911863
0.02395674 89.03035078 0.9497636 0.32504906 0.9909382
0.03988581 3.58620979 0.9392330 0.25279365 0.9920611
0.04204420 699.55658836 0.9356568 0.23920635 0.9931667
0.05263619 0.06127242 0.9265497 0.28134921 0.9839818
0.05364313 34.57839446 0.9264506 0.19560317 0.9934489
0.08838604 47.84104078 0.9029791 0.06296825 0.9955034
ROC was used to select the optimal model using the largest value.
The final values used for the model were sigma = 0.01124608 and C = 21.27349.
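In the results above, ROC is about 0.96 while Sens is only about 0.33 at the default 50% cutoff. The following is a minimal base-R sketch of why the two can disagree and how a lower cutoff changes the trade-off. It does not use the caret model above: the simulated data, the logistic regression, the helper `sens_spec`, and the candidate cutoffs are all assumptions made for illustration. With the caret fit, class probabilities could be obtained with `predict(mod, newdata, type = "prob")` instead, and the cutoff would ideally be chosen on resampled or held-out predictions rather than on the training set.

```r
# Illustrative simulation (not the caret model above): Class1 is the rare class
set.seed(1)
n <- 2000
x <- rnorm(n)
p_true <- plogis(-4 + 2.5 * x)                 # low intercept => few Class1 cases
y <- factor(ifelse(runif(n) < p_true, "Class1", "Class2"),
            levels = c("Class1", "Class2"))

# Any model that yields class probabilities would do; a logistic regression is used here
fit  <- glm(I(y == "Class1") ~ x, family = binomial)
prob <- predict(fit, type = "response")        # estimated P(Class1)

# Hypothetical helper: sensitivity and specificity for Class1 at a given cutoff
sens_spec <- function(prob, y, cutoff) {
  pred_pos <- prob >= cutoff
  is_pos   <- y == "Class1"
  c(cutoff = cutoff,
    Sens   = sum(pred_pos & is_pos) / sum(is_pos),
    Spec   = sum(!pred_pos & !is_pos) / sum(!is_pos))
}

# AUC depends only on how the probabilities rank the cases (Mann-Whitney form),
# so it is unaffected by the choice of cutoff
is_pos <- y == "Class1"
n1  <- sum(is_pos)
n0  <- sum(!is_pos)
auc <- (sum(rank(prob)[is_pos]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Compare the default 50% cutoff with two lower, arbitrarily chosen cutoffs
res <- t(sapply(c(0.5, 0.25, 0.1), function(k) sens_spec(prob, y, k)))
print(round(res, 3))
print(round(auc, 3))
```

Lowering the cutoff can only keep or increase sensitivity, at the cost of specificity, while the AUC stays the same. A low Sens at the 50% cutoff together with a high ROC value therefore points to a poorly placed cutoff rather than to a model that cannot separate the classes.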