This part of the mlr tutorial, https://mlr.mlr-org.com/articles/tutorial/nested_resampling.html#filter-methods-with-tuning, explains how to use a TuneWrapper around a FilterWrapper to tune a filter's threshold. But what if my filter also has hyperparameters that need tuning, for example a random forest variable-importance filter? I cannot seem to tune anything other than the threshold.

For example:

library(survival)
library(mlr)

data(veteran)
set.seed(24601)
task_id = "MAS"
mas.task <- makeSurvTask(id = task_id, data = veteran, target = c("time", "status"))
mas.task <- createDummyFeatures(mas.task)
tuning = makeResampleDesc("CV", iters=5, stratify=TRUE)                             # Tuning: 5-fold CV, no repeats

cox.filt.rsfrc.lrn = makeTuneWrapper(
      makeFilterWrapper(
        makeLearner(cl="surv.coxph", id = "cox.filt.rfsrc", predict.type="response"), 
        fw.method="randomForestSRC_importance",
        cache=TRUE,
        ntree=2000
      ), 
      resampling = tuning, 
      par.set = makeParamSet(
          makeIntegerParam("fw.abs", lower=2, upper=10),
          makeIntegerParam("mtry", lower = 5, upper = 15),
          makeIntegerParam("nodesize", lower=3, upper=25)
      ), 
      control = makeTuneControlRandom(maxit=20),
      show.info = TRUE)

This produces the error message:

Error in checkTunerParset(learner, par.set, measures, control) : 
  Can only tune parameters for which learner parameters exist: mtry,nodesize

Is there any way to tune the hyperparameters of the random forest?

Edit: further attempts based on suggestions in the comments:

  1. Wrapping the tuner around the base learner before passing it to the filter (filter not shown) - fails

    cox.lrn =  makeLearner(cl="surv.coxph", id = "cox.filt.rfsrc", predict.type="response")
    cox.tune = makeTuneWrapper(cox.lrn, 
                       resampling = tuning, 
                       measures=list(cindex),
                       par.set = makeParamSet(
                         makeIntegerParam("mtry", lower = 5, upper = 15),
                         makeIntegerParam("nodesize", lower=3, upper=25),
                         makeIntegerParam("fw.abs", lower=2, upper=10)
                       ),
                       control = makeTuneControlRandom(maxit=20),
                       show.info = TRUE)
    
    Error in checkTunerParset(learner, par.set, measures, control) : 
    Can only tune parameters for which learner parameters exist: mtry,nodesize,fw.abs
    
  2. Two levels of tuning - fails

    cox.lrn =  makeLearner(cl="surv.coxph", id = "cox.filt.rfsrc", predict.type="response")
    cox.filt = makeFilterWrapper(cox.lrn,
                         fw.method="randomForestSRC_importance",
                         cache=TRUE,
                         ntree=2000)
    cox.tune = makeTuneWrapper(cox.filt, 
                       resampling = tuning, 
                       measures=list(cindex),
                       par.set = makeParamSet(
                         makeIntegerParam("fw.abs", lower=2, upper=10)
                       ),
                       control = makeTuneControlRandom(maxit=20),
                       show.info = TRUE)
    
    cox.tune2 = makeTuneWrapper(cox.tune, 
                       resampling = tuning, 
                       measures=list(cindex),
                       par.set = makeParamSet(
                         makeIntegerParam("mtry", lower = 5, upper = 15),
                         makeIntegerParam("nodesize", lower=3, upper=25)
                       ),
                       control = makeTuneControlRandom(maxit=20),
                       show.info = TRUE)
    
    Error in makeBaseWrapper(id, learner$type, learner, learner.subclass = c(learner.subclass,  : 
      Cannot wrap a tuning wrapper around another optimization wrapper!
    
1 Answer

It looks like you currently cannot tune the hyperparameters of the filter. You can change them manually by passing the respective arguments to makeFilterWrapper(), but you cannot tune them. During filtering you can only tune one of fw.abs, fw.perc or fw.threshold.

I do not know how much of an effect different hyperparameters have on the feature ranking produced by the RandomForest filter. One way to check its robustness would be to compute the variable importances for several mtry settings (e.g. via getFeatureImportance()) and compare them. If the rank correlation between them is very high, you can safely skip tuning the RF filter. (Maybe you want to use a different filter that does not have this issue at all?)
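
Below is a minimal sketch of such a robustness check. It goes through randomForestSRC::rfsrc() directly rather than through mlr's getFeatureImportance(), and the ntree/mtry values are purely illustrative assumptions:

library(survival)
library(randomForestSRC)

data(veteran)
set.seed(24601)

# Fit a survival forest for a given mtry and return its variable importances (VIMP)
imp_for_mtry <- function(m) {
  fit <- rfsrc(Surv(time, status) ~ ., data = veteran,
               ntree = 2000, mtry = m, importance = TRUE)
  fit$importance
}

# Spearman correlation of the importances under two mtry settings;
# a value close to 1 suggests the filter ranking barely depends on mtry
cor(imp_for_mtry(2), imp_for_mtry(5), method = "spearman")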

If you really want this feature, you might want to open a PR for the package :)

lrn = makeLearner(cl = "surv.coxph", id = "cox.filt.rfsrc", predict.type = "response")

filter_wrapper = makeFilterWrapper(
  lrn,
  fw.method = "randomForestSRC_importance",
  cache = TRUE,
  ntree = 2000  # passed through to the filter; fixed, not tuned
)

cox.filt.rsfrc.lrn = makeTuneWrapper(
  filter_wrapper,
  resampling = tuning,
  par.set = makeParamSet(
    makeIntegerParam("fw.abs", lower = 2, upper = 10)
  ),
  control = makeTuneControlRandom(maxit = 20),
  show.info = TRUE)
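
If you want to fix mtry and nodesize at specific values instead of tuning them, they can presumably be passed through makeFilterWrapper() in the same way as ntree (additional arguments are handed down to the filter); the values below are only illustrative:

filter_wrapper_fixed = makeFilterWrapper(
  lrn,
  fw.method = "randomForestSRC_importance",
  cache = TRUE,
  ntree = 2000,
  mtry = 5,       # fixed, not tuned
  nodesize = 10   # fixed, not tuned
)
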
answered 2019-09-19 07:47