
I am trying to fit a robust linear regression with an interaction term using the caret package in R, but I get the following error:

Error in train.default(x, y, weights = w, ...) : Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  :
  There were missing values in resampled performance measures.

My code is below:

mod <- train(
    Pac ~ clearSkyPOA + clearSkyPOA*TotalCover + Temp2,
    data = training,
    method = "rlm",
    metric = "RMSE",
    preProc = c("center","scale","BoxCox"),
    trControl = trainControl(method="cv", number = 5),
    na.action = na.omit
  )
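
For reference, here is a minimal sketch of how I understand the formula above to expand into predictor columns. It uses base R's model.matrix on a hypothetical toy data frame (`toy`, with values roughly rounded from the first rows of the sample further down); I am assuming that caret's formula interface builds its predictors in a similar way, which I have not verified.

```r
# Hypothetical toy data: column names mirror the question,
# values are roughly rounded from the first rows of the dput sample.
toy <- data.frame(
  Pac         = c(3.4, 38.3, 120.9, 258.0, 367.2, 269.1),
  clearSkyPOA = c(63.0, 230.0, 472.9, 646.3, 739.9, 751.9),
  TotalCover  = c(0.60, 0.58, 0.51, 0.39, 0.27, 0.23),
  Temp2       = c(188.4, 193.6, 197.2, 197.8, 195.5, 191.1)
)

# Expand the formula into a design matrix with base R.
# a*b is shorthand for a + b + a:b, so TotalCover gets a main effect too.
X <- model.matrix(Pac ~ clearSkyPOA + clearSkyPOA * TotalCover + Temp2,
                  data = toy)

# List the generated columns, including the interaction column.
print(colnames(X))

# Compare the estimated column rank with the number of columns;
# a smaller rank would point to a (numerically) rank-deficient design.
print(c(rank = qr(X)$rank, columns = ncol(X)))
```

So the interaction adds one extra column, clearSkyPOA:TotalCover, on top of the three main effects. Whether that column is what breaks the pre-processing or the rlm fit is exactly what I cannot tell.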

If I remove the interaction term 'clearSkyPOA*TotalCover', it works as expected. For example, with this code:

mod <- train(
    Pac ~ clearSkyPOA + TotalCover+Temp2,
    data = training,
    method = "rlm",
    metric = "RMSE",
    preProc= c("center","scale","BoxCox"),
    trControl =  trainControl(method="cv", number = 5),
    na.action=na.omit
  )

I get the following results:

Robust Linear Model 

4363 samples
   3 predictor

Pre-processing: centered (3), scaled (3), Box-Cox transformation (2) 
Resampling: Cross-Validated (5 fold) 
Summary of sample sizes: 3490, 3490, 3491, 3491, 3490 
Resampling results across tuning parameters:

  intercept  psi           RMSE      Rsquared 
  FALSE      psi.huber     291.3261  0.7501889
  FALSE      psi.hampel    291.3261  0.7501889
  FALSE      psi.bisquare  291.3470  0.7499932
   TRUE      psi.huber     115.0178  0.7488397
   TRUE      psi.hampel    114.2018  0.7500523
   TRUE      psi.bisquare  115.4231  0.7483018

RMSE was used to select the optimal model using  the smallest value.
The final values used for the model were intercept = TRUE and psi = psi.hampel. 

Am I missing something? Below is the dput(training) output for a sample of 20 rows:

structure(list(Pac = c(3.42857142857143, 38.25, 120.916666666667, 
258, 367.166666666667, 269.083333333333, 233.75, 112.416666666667, 
21.9166666666667, 0.2, 1.5, 12.4166666666667, 134.916666666667, 
104.333333333333, 394.583333333333, 342.5, 303.333333333333, 
151.5, 42.0833333333333, 4.83333333333333), clearSkyPOA = c(63.0465796511235, 
230.023517163135, 472.935466225438, 646.271261971453, 739.926063392829, 
751.872076941902, 681.91937141018, 531.40317803238, 306.020562749019, 
120.318359249055, 68.2689523552881, 229.800769386719, 473.162397232603, 
647.082096293271, 741.364282016807, 753.955817698295, 684.656233771643, 
534.787114500355, 309.953073794329, 114.55351678131), TotalCover = c(0.602923, 
0.5798824, 0.5095124, 0.3896642, 0.2744389, 0.232004, 0.3052016, 
0.4355463, 0.5392107, 0.5571411, 0.4599758, 0.4555472, 0.4434351, 
0.41583, 0.3704268, 0.306295, 0.2271317, 0.1551105, 0.1170307, 
0.1307881), Temp = c(13.72545, 13.91255, 14.04348, 14.06298, 
13.98118, 13.82455, 13.61805, 13.3806, 13.12966, 12.87026, 12.37558, 
12.76012, 13.12112, 13.37877, 13.5505, 13.67806, 13.7903, 13.86462, 
13.86556, 13.76468), Temp2 = c(188.3879777025, 193.5590475025, 
197.2193305104, 197.7674064804, 195.4733941924, 191.1181827025, 
185.4512858025, 179.04045636, 172.3879717156, 165.6435924676, 
153.1549803364, 162.8206624144, 172.1637900544, 178.9914867129, 
183.61605025, 187.0893253636, 190.17237409, 192.2276877444, 192.2537541136, 
189.4664155024)), .Names = c("Pac", "clearSkyPOA", "TotalCover", 
"Temp", "Temp2"), row.names = c(NA, 20L), class = "data.frame")