First, here is some reproducible code:
library(ggplot2)
y = c(0, 0, 1, 2, 0, 0, 1, 3, 0, 0, 3, 0, 6, 2, 8, 16, 21, 39, 48, 113,
      92, 93, 127, 159, 137, 46, 238, 132, 124, 185, 171, 250, 250, 187,
      119, 151, 292, 94, 281, 146, 163, 104, 156, 272, 273, 212, 210, 135,
      187, 208, 310, 276, 235, 246, 190, 232, 254, 446, 314, 402, 276, 279,
      386, 402, 238, 581, 434, 159, 261, 356, 440, 498, 495, 462, 306, 233,
      396, 331, 418, 293, 431, 300, 222, 222, 479, 501, 702, 790, 681)
x = 1:length(y)
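Now I am trying to fit a degree-3 polynomial regression curve to this dataset. I want to know the coefficients of the model, which I get from summary(lm(formula = y ~ poly(x, 3))), but the result looks absurd: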
Call:
lm(formula = y ~ poly(x, 3))

Residuals:
     Min       1Q   Median       3Q      Max
-253.696  -47.582   -9.709   44.314  271.183

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   223.978      9.703  23.083   <2e-16 ***
poly(x, 3)1  1420.644     91.538  15.520   <2e-16 ***
poly(x, 3)2    62.375     91.538   0.681    0.497
poly(x, 3)3   130.161     91.538   1.422    0.159
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 91.54 on 85 degrees of freedom
Multiple R-squared:  0.7411,  Adjusted R-squared:  0.732
F-statistic: 81.12 on 3 and 85 DF,  p-value: < 2.2e-16
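These coefficient estimates are far too high for my model, and I am confused about why this output is returned.
Why is this happening? Where did I go wrong?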
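One thing that may be worth checking: by default, poly() builds orthogonal polynomials (raw = FALSE), so the reported coefficients are not the coefficients of x, x^2 and x^3. The sketch below compares the default with raw = TRUE. It uses a small made-up series (x_demo and y_demo are hypothetical names and values, not the data above), so it only illustrates the idea and is not a diagnosis of the model in this question.

```r
# Hypothetical toy data, NOT the series from the question:
# a known cubic plus a small deterministic wiggle.
x_demo <- 1:20
y_demo <- 2 + 0.5 * x_demo - 0.1 * x_demo^2 + 0.01 * x_demo^3 + sin(x_demo)

# Default: orthogonal polynomial basis (raw = FALSE).
fit_orth <- lm(y_demo ~ poly(x_demo, 3))

# Raw basis: the columns are x, x^2, x^3 themselves.
fit_raw <- lm(y_demo ~ poly(x_demo, 3, raw = TRUE))

# The coefficients differ because the two bases differ ...
print(coef(fit_orth))
print(coef(fit_raw))

# ... but both describe the same fitted curve.
print(all.equal(fitted(fit_orth), fitted(fit_raw)))

# With the orthogonal basis, the intercept is simply the mean of the response.
print(all.equal(unname(coef(fit_orth)[1]), mean(y_demo)))
```

If the same holds for the data in the question, the large estimates would be a consequence of the basis rather than a sign of a bad fit, and lm(y ~ poly(x, 3, raw = TRUE)) would give coefficients on the scale of x, x^2 and x^3.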