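I am currently working with the wine data from the MMST package. I split the full dataset into training and test sets and built a tree with code similar to the following: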
library("rpart")
library("gbm")
library("randomForest")
library("MMST")
data(wine)
# Split the 178 observations into a training set (142) and a test set (36)
aux <- 1:178
train_indis <- sample(aux, 142, replace = FALSE)
test_indis <- setdiff(aux, train_indis)
train <- wine[train_indis, ]
test <- wine[test_indis, ]
model.control <- rpart.control(minsplit = 5, xval = 10, cp = 0)
fit_wine <- rpart(class ~ MalicAcid + Ash + AlcAsh + Mg + Phenols + Proa + Color + Hue + OD + Proline, data = train, method = "class", control = model.control)
windows()  # open a new graphics device (Windows only; dev.new() is the portable equivalent)
plot(fit_wine, branch = 0.5, uniform = TRUE, compress = TRUE, main = "Full Tree: without pruning")
text(fit_wine, use.n = TRUE, all = TRUE, cex = 0.6)
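I get a plot like this:
What do the numbers under each node mean (for example, 0/1/48 under Grigolino)? If I want to know how many training and test samples fall into each node, what should I add to the code?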