
How should I interpret it when the sensitivity is very low, even though the AUC from caret's cross-validation resampling results on my training data is very high?

Does this mean the model performs poorly?


1 Answer


This usually happens when there is a class imbalance and the default 50% probability cutoff produces poor predictions, but the class probabilities, although poorly calibrated, still do a good job of separating the classes.

Here is an example:

library(caret)

# Simulate a two-class data set; the large intercept creates a class imbalance
set.seed(1)
dat <- twoClassSim(500, intercept = 10)

# Tune an RBF SVM over 10 random parameter combinations, optimizing the ROC AUC
set.seed(2)
mod <- train(Class ~ ., data = dat, method = "svmRadial",
             tuneLength = 10,
             preProc = c("center", "scale"),
             metric = "ROC",
             trControl = trainControl(search = "random",
                                      classProbs = TRUE, 
                                      summaryFunction = twoClassSummary))

The results are:

> mod
Support Vector Machines with Radial Basis Function Kernel 

500 samples
 15 predictor
  2 classes: 'Class1', 'Class2' 

Pre-processing: centered (15), scaled (15) 
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 500, 500, 500, 500, 500, 500, ... 
Resampling results across tuning parameters:

  sigma       C             ROC        Sens        Spec     
  0.01124608   21.27349102  0.9615725  0.33389177  0.9910125
  0.01330079  419.19384543  0.9579240  0.34620779  0.9914320
  0.01942163   85.16782989  0.9535367  0.33211255  0.9920583
  0.02168484  632.31603140  0.9516538  0.33065224  0.9911863
  0.02395674   89.03035078  0.9497636  0.32504906  0.9909382
  0.03988581    3.58620979  0.9392330  0.25279365  0.9920611
  0.04204420  699.55658836  0.9356568  0.23920635  0.9931667
  0.05263619    0.06127242  0.9265497  0.28134921  0.9839818
  0.05364313   34.57839446  0.9264506  0.19560317  0.9934489
  0.08838604   47.84104078  0.9029791  0.06296825  0.9955034

ROC was used to select the optimal model using  the largest value.
The final values used for the model were sigma = 0.01124608 and C = 21.27349.
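
Since the class probabilities rank the two classes well but the default 50% cutoff does not, one practical follow-up is to choose a different probability cutoff. The sketch below is not part of the original answer; it assumes the mod object fitted above and the pROC package, and uses the "best" (Youden's J) threshold as an illustration.

# Sketch only: pick a cutoff other than 0.5 from the class probabilities
library(pROC)

# Class probabilities from the fitted model on the training data
probs <- predict(mod, dat, type = "prob")

# ROC curve treating Class1 (the minority class in this simulation) as the case
roc_obj <- roc(response = dat$Class, predictor = probs$Class1,
               levels = c("Class2", "Class1"))

# Cutoff that balances sensitivity and specificity instead of the default 0.5
best <- coords(roc_obj, "best",
               ret = c("threshold", "sensitivity", "specificity"))
best

# Classify with the new cutoff
cutoff <- best[["threshold"]]
pred_class <- factor(ifelse(probs$Class1 >= cutoff, "Class1", "Class2"),
                     levels = levels(dat$Class))

Note that choosing the threshold on the same data used for training is optimistic; a more careful alternative is to derive it from resampled predictions, for example with caret's thresholder(), which requires saving the hold-out predictions via savePredictions in trainControl.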
answered 2016-09-02T20:40:39.423