If you want to "standardize" in some sense, you can use the scale
function, which centers each column and sets its standard deviation to 1.
> scale( mym )
  (Intercept)  a  b  c
1         NaN -1 -1 -1
2         NaN  0  0  0
3         NaN  1  1  1
attr(,"assign")
[1] 0 1 2 3
attr(,"scaled:center")
(Intercept)           a           b           c
          1           2           2           2
attr(,"scaled:scale")
(Intercept)           a           b           c
          0           1           1           1
> mym
  (Intercept) a b c
1           1 1 1 1
2           1 2 2 2
3           1 3 3 3
attr(,"assign")
[1] 0 1 2 3
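The NaN values in the intercept column come from dividing by a standard deviation of zero. Here is a minimal sketch of that arithmetic. The construction of mym is not shown above, so the data frame below is an assumption, chosen to reproduce the printed matrix:

```r
# Assumed construction of the matrix shown above; the original data frame is not given.
mydf <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3), c = c(1, 2, 3))
mym  <- model.matrix(~ a + b + c, mydf)

# A constant column has a standard deviation of zero ...
intercept_sd <- sd(mym[, "(Intercept)"])

# ... so after centering, scale() effectively computes 0 / 0, which is NaN in R.
centered         <- mym[, "(Intercept)"] - mean(mym[, "(Intercept)"])
scaled_intercept <- centered / intercept_sd

# A non-constant column is centered at its mean (2) and divided by its sd (1).
scaled_a <- (mym[, "a"] - mean(mym[, "a"])) / sd(mym[, "a"])
```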
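As you can see, it does not really make sense to "standardize" the whole model matrix when there is an "(Intercept)" term: that column is constant, so its standard deviation is 0 and scaling turns it into NaN. So you can do this instead: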
> mym[ , -1 ] <- scale( mym[,-1] )
> mym
  (Intercept)  a  b  c
1           1 -1 -1 -1
2           1  0  0  0
3           1  1  1  1
attr(,"assign")
[1] 0 1 2 3
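If you would rather not rely on the intercept being the first column, the same idea can be wrapped in a small function that finds the intercept by name. This is only a sketch: scale_model_matrix is a hypothetical helper, not part of base R, and mydf is again an assumed data frame.

```r
# Hypothetical helper (not part of base R): standardize every column of a
# model matrix except the intercept, which is located by name, not position.
scale_model_matrix <- function(mm) {
  keep <- colnames(mm) != "(Intercept)"
  # drop = FALSE keeps a matrix even if only one column is selected
  mm[, keep] <- scale(mm[, keep, drop = FALSE])
  mm
}

# Assumed example data, matching the matrix printed above.
mydf <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3), c = c(1, 2, 3))
mym  <- scale_model_matrix(model.matrix(~ a + b + c, mydf))
```

Note that the assignment copies only the scaled values into the matrix. The "scaled:center" and "scaled:scale" attributes are not carried over, so save the result of scale() separately if you need them later.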
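This is, in effect, what the model matrix would be if your default contrasts option were set to "contr.sum" and the columns were factors.
model.matrix only accepts this as an internal operation if the variable to be "standardized" is a factor: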
> mym <- model.matrix(as.formula("~ a + b + c"), mydf, contrasts.arg=list(a="contr.sum"))
Error in `contrasts<-`(`*tmp*`, value = contrasts.arg[[nn]]) :
contrasts apply only to factors
> mydf <- data.frame(a = factor(c(1,2,3)), b = c(1,2,3), c = c(1,2,3))
> mym <- model.matrix(as.formula("~ a + b + c"), mydf, contrasts.arg=list(a="contr.sum"))
> mym
  (Intercept) a1 a2 b c
1           1  1  0 1 1
2           1  0  1 2 2
3           1 -1 -1 3 3
attr(,"assign")
[1] 0 1 1 2 3
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.sum"
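When the data frame mixes factors and numeric columns, one possible way to combine the two approaches is to standardize the numeric columns before building the model matrix and let contrasts.arg handle the factor. This is a sketch under that assumption, not something model.matrix does for you:

```r
mydf <- data.frame(a = factor(c(1, 2, 3)), b = c(1, 2, 3), c = c(1, 2, 3))

# Standardize only the numeric columns; as.numeric() drops the matrix
# shape and attributes that scale() returns.
num <- vapply(mydf, is.numeric, logical(1))
mydf[num] <- lapply(mydf[num], function(x) as.numeric(scale(x)))

# Sum-to-zero contrasts for the factor, as in the call above.
mym <- model.matrix(~ a + b + c, mydf, contrasts.arg = list(a = "contr.sum"))
```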