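I have been exploring the wonderful mlr package using the Titanic dataset. My problem is implementing a random forest. More specifically, I want to tune cutoff
(i.e., the threshold for assigning an impure leaf to a given class). The problem is that the cutoff
parameter takes two values, but I have only figured out how to tune hyperparameters in mlr
that take a single value.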
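For context, the randomForest documentation describes cutoff as a vector with one entry per class, where the predicted class is the one with the largest ratio of vote proportion to cutoff. That is why a two-class target such as Survived needs two values. The following is only a minimal base R sketch of that rule: the vote proportions are made up, and winning_class is a hypothetical helper that is not part of randomForest or mlr.

```r
# Illustration only: mimics the documented cutoff rule of randomForest
# with made-up vote proportions; it does not call randomForest or mlr.

# Hypothetical helper: return the class with the largest votes / cutoff ratio.
winning_class <- function(votes, cutoff) {
  stopifnot(length(votes) == length(cutoff), all(cutoff > 0))
  ratios <- votes / cutoff
  names(votes)[which.max(ratios)]
}

# Made-up vote proportions for a single passenger (classes "0" and "1").
votes <- c("0" = 0.6, "1" = 0.4)

# Equal cutoffs reduce to a plain majority vote.
default_pick <- winning_class(votes, cutoff = c(0.5, 0.5))

# A lower cutoff for class "1" makes that class easier to win.
skewed_pick <- winning_class(votes, cutoff = c(0.75, 0.25))

print(default_pick)
print(skewed_pick)
```

With these made-up numbers the two cutoff vectors pick different classes for the same votes, which is the effect that tuning cutoff is meant to explore.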
Code:
library(mlr)
library(dplyr)

dTrain <- read.csv('path/to/data/')

# Defining the Task
trainTask <- makeClassifTask(
  data = dTrain %>%
    select(-Name, -Ticket, -Cabin) %>%
    filter(complete.cases(.)),
  target = "Survived",
  id = "PassengerId"
)

# Defining the Learner
rfLRN <- makeLearner("classif.randomForest")

# Defining the Parameter Space (this is the attempt that does not work)
ps <- makeParamSet(
  makeDiscreteParam("cutoff", values = list(c(.5, .5), c(.75, .25)))
)
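This is where the problem lies: cutoff
needs two values, but I am not sure how to pass both of them. The attempt above is wrong. I have tried several other parameter
generators, e.g. makeDiscreteVectorParam, but to no avail. Any tips?
In contrast, if I try to tune a single-valued parameter such as mtry
(i.e., the number of features considered at a given split), everything works fine.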
# Defining the Hyperparameter Space
ps <- makeParamSet(
  makeDiscreteParam("mtry", values = c(2, 3, 4, 5))
)

# Defining Resampling
cvTask <- makeResampleDesc("CV", iters = 5L)

# Defining Search
search <- makeTuneControlGrid()

# Tune!
tune <- tuneParams(
  learner = rfLRN,
  task = trainTask,
  resampling = cvTask,
  measures = list(acc),
  par.set = ps,
  control = search,
  show.info = TRUE
)