我一直在尝试使用 caret 包应用递归特征选择。我需要的是 ref 使用 AUC 作为性能指标。谷歌搜索一个月后,我无法让该过程正常工作。这是我使用的代码:
library(caret)
library(doMC)
registerDoMC(cores = 4)
data(mdrr)
subsets <- c(1:10)
ctrl <- rfeControl(functions=caretFuncs,
method = "cv",
repeats =5, number = 10,
returnResamp="final", verbose = TRUE)
trainctrl <- trainControl(classProbs= TRUE)
caretFuncs$summary <- twoClassSummary
set.seed(326)
rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass, sizes=subsets,
rfeControl=ctrl,
method="svmRadial",
metric="ROC",
trControl=trainctrl)
执行此脚本时,我得到以下结果:
Recursive feature selection
Outer resampling method: Cross-Validation (10 fold)
Resampling performance over subset size:
Variables Accuracy Kappa AccuracySD KappaSD Selected
1 0.7501 0.4796 0.04324 0.09491
2 0.7671 0.5168 0.05274 0.11037
3 0.7671 0.5167 0.04294 0.09043
4 0.7728 0.5289 0.04439 0.09290
5 0.8012 0.5856 0.04144 0.08798
6 0.8049 0.5926 0.02871 0.06133
7 0.8049 0.5925 0.03458 0.07450
8 0.8124 0.6090 0.03444 0.07361
9 0.8181 0.6204 0.03135 0.06758 *
10 0.8069 0.5971 0.04234 0.09166
342 0.8106 0.6042 0.04701 0.10326
The top 5 variables (out of 9):
nC, X3v, Sp, X2v, X1v
该过程始终使用准确性作为性能指标。出现的另一个问题是,当我尝试从使用以下方法获得的模型中获得预测时:
predictions <- predict(rf.profileROC.Radial$fit,mdrrDescr)
我收到以下消息
In predictionFunction(method, modelFit, tempX, custom = models[[i]]$control$custom$prediction) :
kernlab class prediction calculations failed; returning NAs
事实证明,从模型中得到一些预测是不可能的。
以下是通过获取的信息sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8
[4] LC_COLLATE=es_ES.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid parallel splines stats graphics grDevices utils datasets methods base
other attached packages:
[1] e1071_1.6-2 class_7.3-9 pROC_1.6.0.1 doMC_1.3.2 iterators_1.0.6 foreach_1.4.1
[7] caret_6.0-21 ggplot2_0.9.3.1 lattice_0.20-24 kernlab_0.9-19
loaded via a namespace (and not attached):
[1] car_2.0-19 codetools_0.2-8 colorspace_1.2-4 compiler_3.0.2 dichromat_2.0-0
[6] digest_0.6.4 gtable_0.1.2 labeling_0.2 MASS_7.3-29 munsell_0.4.2
[11] nnet_7.3-7 plyr_1.8 proto_0.3-10 RColorBrewer_1.0-5 Rcpp_0.10.6
[16] reshape2_1.2.2 scales_0.2.3 stringr_0.6.2 tools_3.0.2