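I am trying to perform 5-fold cross-validation on a model that I estimated with the gamlss package. When I use the same code to estimate a different model (e.g., OLS), I have no problems. However, when I change the model to gamlss, I get an error message.
Here is an illustrative example: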
# load packages and data
library(caret)
library(gamlss)
data(usair)
# create 5 folds
folds <- createFolds(usair$y, k = 5)
When I run this code, everything works fine and I get a list containing the performance measures for each fold:
### 1) OLS
# estimate model 5 times and get performance measures
res1 <- lapply(folds, function(x) {
  # Create training and test data set
  trainset <- usair[-x, ]
  testset <- usair[x, ]
  # estimate the model with the training data set
  m1 <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6,
           data = trainset)
  # predict outcomes with the test data set
  y_pred <- predict(m1, newdata = testset)
  # store the actual outcome values in a vector
  y_true <- testset$y
  # Store performance measures
  MAE <- sum(abs(y_true - y_pred)) / length(y_true) # Mean Absolute Error
  MSE <- sum((y_true - y_pred)^2) / length(y_true) # Mean Squared Error
  MAPE <- 100 * sum(abs(y_true - y_pred) / y_true) / length(y_true) # Mean Absolute Percentage Error
  R2 <- 1 - MSE / var(y_true)
  list(MAE = MAE,
       MSE = MSE,
       MAPE = MAPE,
       R2 = R2)
})
However, when I run this code with the model type changed to gamlss, I get an error message:
### 2) gamlss
# estimate model 5 times and get performance measures
res2 <- lapply(folds, function(x) {
  # Create training and test data set
  trainset <- usair[-x, ]
  testset <- usair[x, ]
  # estimate the model with the training data set
  m1 <- gamlss(y ~ ri(x.vars = c("x1", "x2", "x3", "x4", "x5", "x6"), Lp = 1),
               data = trainset)
  # predict outcomes with the test data set
  y_pred <- predict(m1, newdata = testset)
  # store the actual outcome values in a vector
  y_true <- testset$y
  # Store performance measures
  MAE <- sum(abs(y_true - y_pred)) / length(y_true) # Mean Absolute Error
  MSE <- sum((y_true - y_pred)^2) / length(y_true) # Mean Squared Error
  MAPE <- 100 * sum(abs(y_true - y_pred) / y_true) / length(y_true) # Mean Absolute Percentage Error
  R2 <- 1 - MSE / var(y_true)
  list(MAE = MAE,
       MSE = MSE,
       MAPE = MAPE,
       R2 = R2)
})
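The error message is: "Error in eval(substitute(data)) : object 'trainset' not found". I have run the code inside the function separately for each fold, and it works. It seems that the training and test sets can no longer be created. Yet all I did was change the model.
Does anyone know what the problem might be here?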