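The caret package (library(caret)) provides a very simple function, dummyVars, for creating dummy variables, which is especially handy when you have more than one factor variable. You do have to make sure the variables you want to encode are factors, though. For example, if Sales$year is numeric, you must convert it to a factor first: Sales$year <- as.factor(Sales$year)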
Suppose we have the original dataset "Sales", as follows:
   year    Sales Region
1  2010 3695.543  North
2  2010 9873.037   West
3  2008 3579.458   West
4  2005 2788.857  North
5  2005 2952.183  North
6  2008 7255.337   West
7  2005 5237.081   West
8  2010 8987.096  North
9  2008 5545.343  North
10 2008 1809.446   West
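If you want to follow along, the data above can be typed in by hand. This is only a minimal sketch: the values are copied from the printout, and storing both year and Region as factors is my assumption about how the original data frame was set up.

```r
# Rebuild the example data frame from the printout above.
Sales <- data.frame(
  year   = c(2010, 2010, 2008, 2005, 2005, 2008, 2005, 2010, 2008, 2008),
  Sales  = c(3695.543, 9873.037, 3579.458, 2788.857, 2952.183,
             7255.337, 5237.081, 8987.096, 5545.343, 1809.446),
  Region = c("North", "West", "West", "North", "North",
             "West", "West", "North", "North", "West")
)

# year starts out numeric, so convert it; the assignment is what
# actually changes the column.
Sales$year <- as.factor(Sales$year)

# Assumption: Region is stored as a factor too (since R 4.0,
# data.frame() keeps strings as character unless told otherwise).
Sales$Region <- as.factor(Sales$Region)

str(Sales)
```

Calling as.factor(Sales$year) without assigning the result back leaves the column numeric, which is an easy mistake to make here.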
Now we can create the dummy variables for both factors at the same time:
> library(lattice)
> library(ggplot2)
> library(caret)
> Salesdummy <- dummyVars(~ ., data = Sales, levelsOnly = TRUE)
> Sdummy <- predict(Salesdummy, Sales)
The result will be:
   2005 2008 2010    Sales RegionNorth RegionWest
1     0    0    1 3695.543           1          0
2     0    0    1 9873.037           0          1
3     0    1    0 3579.458           0          1
4     1    0    0 2788.857           1          0
5     1    0    0 2952.183           1          0
6     0    1    0 7255.337           0          1
7     1    0    0 5237.081           0          1
8     0    0    1 8987.096           1          0
9     0    1    0 5545.343           1          0
10    0    1    0 1809.446           0          1
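If you want to sanity-check the indicator columns without caret, base R's model.matrix can produce the same 0/1 encoding. To be clear, this is a different technique from dummyVars, shown only as a cross-check; the data frame is retyped from the example above, and the column names it produces (year2005, RegionNorth, ...) are model.matrix's own naming, not caret's.

```r
# Same example data, with both grouping columns stored as factors.
Sales <- data.frame(
  year   = factor(c(2010, 2010, 2008, 2005, 2005, 2008, 2005, 2010, 2008, 2008)),
  Sales  = c(3695.543, 9873.037, 3579.458, 2788.857, 2952.183,
             7255.337, 5237.081, 8987.096, 5545.343, 1809.446),
  Region = factor(c("North", "West", "West", "North", "North",
                    "West", "West", "North", "North", "West"))
)

# contrasts(x, contrasts = FALSE) returns an identity matrix, i.e. one
# indicator column per level, so no level is dropped as a baseline.
# "- 1" removes the intercept column.
dummies <- model.matrix(
  ~ year + Region - 1,
  data = Sales,
  contrasts.arg = list(
    year   = contrasts(Sales$year,   contrasts = FALSE),
    Region = contrasts(Sales$Region, contrasts = FALSE)
  )
)

colnames(dummies)
head(dummies, 3)
```

Each row should have exactly one 1 among the year columns and one among the Region columns, which is a quick way to confirm that an encoding came out as intended.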