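I would like to calculate confidence intervals for the area under the curve (AUC) and the cross-validated (cv) AUC in mlr3.
I understand that for regression tasks, standard errors can be obtained with predict_type = "se".
I am wondering how to do the same for AUC/cvAUC within mlr3
(a cvAUC solution outside of mlr3 is given in the update below).
Example data: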
# library
library(mlr3verse)
library(mlbench)
# get example data
data(PimaIndiansDiabetes, package="mlbench")
data <- PimaIndiansDiabetes
# make task
all.task <- TaskClassif$new("all.data", data, target = "diabetes")
#make a learner
learner <- lrn("classif.log_reg", predict_type = "prob")
# resample
rr = resample(all.task, learner, rsmp("cv"))
#> INFO [12:19:45.662] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 5/10)
#> INFO [12:19:45.741] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 8/10)
#> INFO [12:19:45.780] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 10/10)
#> INFO [12:19:45.805] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 2/10)
#> INFO [12:19:45.831] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 6/10)
#> INFO [12:19:45.859] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 1/10)
#> INFO [12:19:45.899] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 9/10)
#> INFO [12:19:45.926] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 7/10)
#> INFO [12:19:45.954] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 3/10)
#> INFO [12:19:45.995] [mlr3] Applying learner 'classif.log_reg' on task 'all.data' (iter 4/10)
# get AUC
rr$aggregate(msr("classif.auc"))
#> classif.auc
#> 0.8297186
Created on 2021-04-02 by the reprex package (v1.0.0)
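For reference, the per-fold AUC values behind the aggregated score can be pulled out with `rr$score()`. The sketch below rebuilds the resampling from the example above and then forms a naive t-based interval across folds. That interval is my own illustration, not a built-in mlr3 confidence interval, and it is only a rough guide because the fold estimates share training data and are not independent. The names `fold_auc` and `naive_ci` are hypothetical.

```r
library(mlr3verse)

# same task and learner as in the example above
data(PimaIndiansDiabetes, package = "mlbench")
task <- TaskClassif$new("all.data", PimaIndiansDiabetes, target = "diabetes")
learner <- lrn("classif.log_reg", predict_type = "prob")

set.seed(1)  # make the fold assignment reproducible
rr <- resample(task, learner, rsmp("cv", folds = 10))

# one AUC value per fold
fold_auc <- rr$score(msr("classif.auc"))$classif.auc

# naive interval: mean +/- t quantile * standard error across folds
# (assumption: treats folds as independent, which they are not)
k <- length(fold_auc)
naive_ci <- mean(fold_auc) + c(-1, 1) * qt(0.975, df = k - 1) * sd(fold_auc) / sqrt(k)
naive_ci
```

The mean of `fold_auc` matches what `rr$aggregate(msr("classif.auc"))` reports, since that measure is averaged over folds by default.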
Update:
Outside of mlr3, I would do this with the cvAUC package:
library(cvAUC)
library(tidyverse)
# extract predictions
rr$predictions() -> cv_pred_model
# prepare data for cv ci
cv_pred_model %>%
map(.,as.data.table) %>%
map_df(~as.data.frame(.x), .id="fold") -> go
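# calculate the 95% CI for the cross-validated AUC
# (assumption: with this data the probability column is prob.pos, because the levels of diabetes are neg/pos)
ci.cvAUC(predictions=go$prob.pos, labels=go$truth, folds=go$fold, confidence=0.95)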
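To show what `ci.cvAUC()` expects and returns, here is a small self-contained sketch on simulated data. The data are hypothetical and unrelated to the diabetes example: `labels` is a 0/1 vector, `predictions` are made-up scores that are higher on average for class 1, and `folds` is a vector of fold ids of the same length.

```r
library(cvAUC)

set.seed(42)
n <- 500

# hypothetical binary outcome and scores (simulated, for illustration only)
labels <- rbinom(n, 1, 0.4)
predictions <- plogis(rnorm(n, mean = labels))

# fold id (1..10) for every observation
folds <- sample(rep(1:10, length.out = n))

out <- ci.cvAUC(predictions = predictions, labels = labels,
                folds = folds, confidence = 0.95)

out$cvAUC  # cross-validated AUC point estimate
out$se     # standard error of the estimate
out$ci     # lower and upper confidence limits
```

The returned list carries the point estimate, its standard error and the interval, so the same three elements can be read off the result of the call on `go` above.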