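I am trying to build a full tree by setting the control to rpart.control(minsplit=2, minbucket=1, cp=0), but it does not work. I think the reason may be that the summary reports the tree with 4 splits at cp = 0, yet that tree is not complete, so its cp should be > 0.
I also checked the data, and more splits are possible. Here is my code: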

#################
# libraries #####
library(datasets)
library(rpart)
library(rpart.plot)
##################
# preparing data #
titanic_obs=c()
for (cl in c("1st", "2nd", "3rd", "Crew")) {
  for (se in c("Male","Female")) {
    for (ag in c("Child","Adult")) {
      for (sur in c("Yes","No")) {
        titanic_obs = rbind(titanic_obs, matrix(rep(c(cl, se, ag, sur), length.out = 4 * Titanic[cl, se, ag, sur]), ncol = 4, byrow = TRUE))
      }
    }
  }
}

colnames(titanic_obs)= c("Class", "Sex", "Age","Survived")
titanic_data = data.frame(titanic_obs)
summary(titanic_data) 
#################
# fitting model #
titanic_rpart = rpart(Survived ~ Sex + Age + Class,
                  data = titanic_data,method="class",
                  control=rpart.control(minsplit=2, minbucket = 1,cp=0))
#################
# checking ######
summary(titanic_rpart)
prp(titanic_rpart, extra=1, uniform=FALSE, branch=1, yesno=FALSE, border.col=0, xsep="/")
#################
# data ##########
adult_men = titanic_data[titanic_data$Sex=="Male" & titanic_data$Age=="Adult",]
all_am = table(adult_men$Class)
survived_am = table(adult_men[adult_men$Survived=="Yes",]$Class)
survived_am/all_am

2 Answers

As noted in the comments on this question, setting cp=-1 builds the full tree.
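A minimal sketch of the comparison, assuming the same Titanic data as in the question (rebuilt here with as.data.frame for brevity; the helper n_leaves is a hypothetical name, not part of rpart). As far as I understand, rpart drops splits whose complexity does not exceed cp, so splits that leave the misclassification error unchanged can vanish at cp=0 but survive with a negative cp.

```r
library(rpart)

# Expand the contingency table into one row per passenger
# (same data as the loop in the question, built more compactly).
counts <- as.data.frame(Titanic)
titanic_data <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                       c("Class", "Sex", "Age", "Survived")]

# Hypothetical helper: count the terminal nodes of a fitted rpart tree.
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

# Tree from the question: cp = 0.
fit_cp0 <- rpart(Survived ~ Sex + Age + Class,
                 data = titanic_data, method = "class",
                 control = rpart.control(minsplit = 2, minbucket = 1, cp = 0))

# Same settings, but with a negative cp so that no split is pruned away.
fit_full <- rpart(Survived ~ Sex + Age + Class,
                  data = titanic_data, method = "class",
                  control = rpart.control(minsplit = 2, minbucket = 1, cp = -1))

# Compare the sizes of the two trees.
cat("leaves with cp = 0: ", n_leaves(fit_cp0), "\n")
cat("leaves with cp = -1:", n_leaves(fit_full), "\n")
```

If the second count is larger, the extra leaves come from splits that separate the data without changing the predicted class on either side.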

Answered 2018-09-24T12:41:39.310

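I can't check right now, but I seem to remember that setting cp=0.000001 or a similarly small number solved this for me at some point. Also note that parameters such as minsplit and minbucket can hold back the growth of the tree, so you may need to set appropriate values for them as well.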
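To see which limits are actually in force, one option is to print the defaults of rpart.control and the control list stored on a fitted model. This is only a sketch: the data frame is rebuilt from the Titanic table as an assumption about the asker's setup, and cp = -1 is used purely as an example value.

```r
library(rpart)

# Defaults used when no control argument is given.
defaults <- rpart.control()
print(defaults[c("minsplit", "minbucket", "cp", "maxdepth")])

# One row per passenger, rebuilt from the built-in Titanic table.
counts <- as.data.frame(Titanic)
titanic_data <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                       c("Class", "Sex", "Age", "Survived")]

# Fit with loosened limits, then read back the settings the fit really used.
fit <- rpart(Survived ~ Sex + Age + Class,
             data = titanic_data, method = "class",
             control = rpart.control(minsplit = 2, minbucket = 1, cp = -1))
print(fit$control[c("minsplit", "minbucket", "cp", "maxdepth")])
```

With the defaults, a node needs at least minsplit observations before a split is attempted and every leaf must hold at least minbucket observations, so small groups can never be separated unless these values are lowered.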

Answered 2014-06-10T20:42:10.077