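I want to fit a robust linear regression with an interaction term using the caret package in R, but I get the following error:
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
  There were missing values in resampled performance measures.
My code is below: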
mod <- train(
  Pac ~ clearSkyPOA + clearSkyPOA*TotalCover + Temp2,
  data = training,
  method = "rlm",
  metric = "RMSE",
  preProc = c("center", "scale", "BoxCox"),
  trControl = trainControl(method = "cv", number = 5),
  na.action = na.omit
)
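For reference, here is a minimal sketch of which predictor columns the formula with the interaction expands to, using base R's model.matrix(). The data frame `toy` is a hypothetical stand-in built from the first four rows of the dput sample (rounded), not the full `training` set, and this only illustrates the formula expansion; it is not a confirmed explanation of the error.

```r
# Hypothetical toy data: first four rows of the dput sample, rounded.
toy <- data.frame(
  clearSkyPOA = c(63.05, 230.02, 472.94, 646.27),
  TotalCover  = c(0.603, 0.580, 0.510, 0.390),
  Temp2       = c(188.39, 193.56, 197.22, 197.77)
)

# In an R formula, a*b is shorthand for a + b + a:b, so the
# repeated main effect clearSkyPOA is only included once.
f_star  <- ~ clearSkyPOA + clearSkyPOA*TotalCover + Temp2
f_colon <- ~ clearSkyPOA + TotalCover + Temp2 + clearSkyPOA:TotalCover

# Column names of the design matrices built from each formula.
cols_star  <- colnames(model.matrix(f_star,  data = toy))
cols_colon <- colnames(model.matrix(f_colon, data = toy))

print(cols_star)
print(setequal(cols_star, cols_colon))
```

So the model with the interaction has four predictor columns (clearSkyPOA, TotalCover, Temp2 and the product clearSkyPOA:TotalCover) plus the intercept, compared with three predictors in the version that works.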
If I remove the interaction term 'clearSkyPOA*TotalCover', it works as expected. For example, with this code:
mod <- train(
  Pac ~ clearSkyPOA + TotalCover + Temp2,
  data = training,
  method = "rlm",
  metric = "RMSE",
  preProc = c("center", "scale", "BoxCox"),
  trControl = trainControl(method = "cv", number = 5),
  na.action = na.omit
)
I get the following results:
Robust Linear Model
4363 samples
3 predictor
Pre-processing: centered (3), scaled (3), Box-Cox transformation (2)
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 3490, 3490, 3491, 3491, 3490
Resampling results across tuning parameters:
  intercept  psi           RMSE      Rsquared
  FALSE      psi.huber     291.3261  0.7501889
  FALSE      psi.hampel    291.3261  0.7501889
  FALSE      psi.bisquare  291.3470  0.7499932
   TRUE      psi.huber     115.0178  0.7488397
   TRUE      psi.hampel    114.2018  0.7500523
   TRUE      psi.bisquare  115.4231  0.7483018
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were intercept = TRUE and psi = psi.hampel.
Am I missing something? Below is the dput() output for a 20-row sample of training:
structure(list(Pac = c(3.42857142857143, 38.25, 120.916666666667,
258, 367.166666666667, 269.083333333333, 233.75, 112.416666666667,
21.9166666666667, 0.2, 1.5, 12.4166666666667, 134.916666666667,
104.333333333333, 394.583333333333, 342.5, 303.333333333333,
151.5, 42.0833333333333, 4.83333333333333), clearSkyPOA = c(63.0465796511235,
230.023517163135, 472.935466225438, 646.271261971453, 739.926063392829,
751.872076941902, 681.91937141018, 531.40317803238, 306.020562749019,
120.318359249055, 68.2689523552881, 229.800769386719, 473.162397232603,
647.082096293271, 741.364282016807, 753.955817698295, 684.656233771643,
534.787114500355, 309.953073794329, 114.55351678131), TotalCover = c(0.602923,
0.5798824, 0.5095124, 0.3896642, 0.2744389, 0.232004, 0.3052016,
0.4355463, 0.5392107, 0.5571411, 0.4599758, 0.4555472, 0.4434351,
0.41583, 0.3704268, 0.306295, 0.2271317, 0.1551105, 0.1170307,
0.1307881), Temp = c(13.72545, 13.91255, 14.04348, 14.06298,
13.98118, 13.82455, 13.61805, 13.3806, 13.12966, 12.87026, 12.37558,
12.76012, 13.12112, 13.37877, 13.5505, 13.67806, 13.7903, 13.86462,
13.86556, 13.76468), Temp2 = c(188.3879777025, 193.5590475025,
197.2193305104, 197.7674064804, 195.4733941924, 191.1181827025,
185.4512858025, 179.04045636, 172.3879717156, 165.6435924676,
153.1549803364, 162.8206624144, 172.1637900544, 178.9914867129,
183.61605025, 187.0893253636, 190.17237409, 192.2276877444, 192.2537541136,
189.4664155024)), .Names = c("Pac", "clearSkyPOA", "TotalCover",
"Temp", "Temp2"), row.names = c(NA, 20L), class = "data.frame")