I am trying to perform 10-fold cross-validation by hand. My data set is called spam.
My code is as follows:
n <- nrow(spam)                               # 4600 rows in spam data set
ncp <- length(spam.rpart2$cptable[, "CP"])    # 20 CP values
group <- rep(1:10, ceiling(n/10))[1:n]        # fill 4600 values with 1 to 10
permid <- sample(1:n)                         # permute numbers
cvtable <- matrix(NA, n, ncp)
for (j in 1:20) {
  for (i in 1:10) {
    trainingset <- permid[group != i]
    testset <- permid[group == i]
    spam.rpart.test <- rpart(spam ~ .,
                             method = "class",
                             cp = spam.rpart2$cptable[j, "CP"],
                             data = spam[trainingset, ])
    cvtable[testset, j] <- predict(spam.rpart.test,
                                   data = spam[testset, ])[, 1]
    # incorrect dimensions!
  }
}
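However, I am stuck on the third line from the end, where the result of predict() is assigned into cvtable. The prediction should contain only 460 values, one per row of the test set, but it gives me 4160 values, so the for loop does not run. I get the following error: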
Error in cvtable[testset, j] <- predict(spam.rpart.test, data = spam[testset, :
number of items to replace is not a multiple of replacement length
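
One likely cause, sketched here rather than confirmed against the actual spam data: predict() for rpart objects takes the new observations through an argument named newdata. An argument named data is not recognised and is silently absorbed by `...`, so the call falls back to predictions for the rows the tree was trained on. The sketch below reproduces the same loop structure with newdata. It assumes the kyphosis data set bundled with rpart as a stand-in for spam, and the two CP values in cps are hypothetical, chosen only for illustration.

```r
library(rpart)  # recommended package, bundled with standard R installations

set.seed(1)                               # reproducible permutation
dat <- kyphosis                           # stand-in for `spam` (assumption)
n <- nrow(dat)
k <- 10
cps <- c(0.1, 0.01)                       # hypothetical CP values (assumption)
group <- rep(1:k, ceiling(n / k))[1:n]    # fold labels 1..k, recycled to length n
permid <- sample(1:n)                     # random permutation of row indices
cvtable <- matrix(NA_real_, n, length(cps))

for (j in seq_along(cps)) {
  for (i in 1:k) {
    trainingset <- permid[group != i]
    testset <- permid[group == i]
    fit <- rpart(Kyphosis ~ .,
                 method = "class",
                 cp = cps[j],
                 data = dat[trainingset, ])
    # `newdata`, not `data`: this yields one row of predictions per test row
    pred <- predict(fit, newdata = dat[testset, ])
    # column 1 holds the predicted probability of the first class level
    cvtable[testset, j] <- pred[, 1]
  }
}
```

Applied to the loop in the question, the equivalent change would be to pass newdata = spam[testset, ] in the predict() call. Looping over seq_along() of the CP values instead of a hard-coded 1:20 also keeps the loop in step with the number of columns in cvtable.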