You can use dcast from the "reshape2" package.
library(reshape2)
dcast(mydf, customerid + store + Date ~ product, value.var="Sales")
# customerid store Date A B C D
# 1 1 x1 1/16/2013 4 2 2 NA
# 2 1 X1 1/2/2013 4 NA NA NA
# 3 1 x2 1/9/2013 4 4 NA NA
# 4 2 x1 1/23/2013 2 NA NA NA
# 5 2 x1 2/6/2013 NA NA 2 NA
# 6 2 x2 1/30/2013 NA 2 NA NA
# 7 2 x3 2/13/2013 NA NA NA 4
If you want "" instead of NA, you can do that too, but note that you will coerce those columns to character.
dcast(mydf, customerid + store + Date ~ product, value.var="Sales", fill="")
#   customerid store      Date A B C D
# 1          1    x1 1/16/2013 4 2 2
# 2          1    X1  1/2/2013 4
# 3          1    x2  1/9/2013 4 4
# 4          2    x1 1/23/2013 2
# 5          2    x1  2/6/2013     2
# 6          2    x2 1/30/2013   2
# 7          2    x3 2/13/2013       4
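The coercion mentioned above can be seen without any package. This is only a minimal base R sketch on a made-up vector named sales (not part of the original data), assuming that filling with "" behaves like an ordinary replacement of NA values:

```r
# Hypothetical numeric vector with one missing value, for illustration only.
sales <- c(4, NA, 2)
class(sales)                 # "numeric"

# Putting "" into a numeric vector forces the whole vector to character,
# which is what happens to the value columns when the fill is "".
sales[is.na(sales)] <- ""
class(sales)                 # "character"
sales                        # "4" ""  "2"
```

After this, arithmetic such as sum(sales) no longer works, so the blank fill is probably best kept for display or export rather than further computation.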
For a base R solution, you can use reshape():
reshape(mydf, direction = "wide",
        idvar = c("customerid", "store", "Date"),
        timevar = "product")
# customerid store Date Sales.A Sales.B Sales.C Sales.D
# 1 1 X1 1/2/2013 4 NA NA NA
# 2 1 x2 1/9/2013 4 4 NA NA
# 4 1 x1 1/16/2013 4 2 2 NA
# 7 2 x1 1/23/2013 2 NA NA NA
# 8 2 x2 1/30/2013 NA 2 NA NA
# 9 2 x1 2/6/2013 NA NA 2 NA
# 10 2 x3 2/13/2013 NA NA NA 4
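To try this without the original data, here is a self-contained sketch. The mydf below is reconstructed from the rows printed in this answer, so the column types (character for store and Date, numeric for Sales) are an assumption rather than the asker's actual data:

```r
# mydf rebuilt from the printed rows; the column types are assumed.
mydf <- data.frame(
  customerid = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
  product    = c("A", "B", "A", "C", "B", "A", "A", "B", "C", "D"),
  store      = c("X1", "x2", "x2", "x1", "x1", "x1", "x1", "x2", "x1", "x3"),
  Date       = c("1/2/2013", "1/9/2013", "1/9/2013", "1/16/2013", "1/16/2013",
                 "1/16/2013", "1/23/2013", "1/30/2013", "2/6/2013", "2/13/2013"),
  Sales      = c(4, 4, 4, 2, 2, 4, 2, 2, 2, 4),
  stringsAsFactors = FALSE
)

# One row per customerid/store/Date combination, one Sales.* column per product.
wide <- reshape(mydf, direction = "wide",
                idvar = c("customerid", "store", "Date"),
                timevar = "product")
wide
```

Note that "X1" and "x1" are different store values here, which is why customer 1 gets separate rows for them.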
Another possibility is to use model.matrix (thanks to @Thomas for explaining the approach in a recent Q&A):
cbind(mydf, model.matrix(~ 0 + product, data = mydf) * mydf$Sales)
# customerid product store Date Sales productA productB productC productD
# 1 1 A X1 1/2/2013 4 4 0 0 0
# 2 1 B x2 1/9/2013 4 0 4 0 0
# 3 1 A x2 1/9/2013 4 4 0 0 0
# 4 1 C x1 1/16/2013 2 0 0 2 0
# 5 1 B x1 1/16/2013 2 0 2 0 0
# 6 1 A x1 1/16/2013 4 4 0 0 0
# 7 2 A x1 1/23/2013 2 2 0 0 0
# 8 2 B x2 1/30/2013 2 0 2 0 0
# 9 2 C x1 2/6/2013 2 0 0 2 0
# 10 2 D x3 2/13/2013 4 0 0 0 4
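To see why multiplying by the Sales column works, here is a small sketch on a made-up three-row data frame (toy, ind and scaled are hypothetical names used only for illustration):

```r
# Hypothetical three-row data frame, not the asker's data.
toy <- data.frame(product = c("A", "B", "A"),
                  Sales   = c(4, 2, 3),
                  stringsAsFactors = FALSE)

# ~ 0 + product drops the intercept, so each product gets its own 0/1 column.
ind <- model.matrix(~ 0 + product, data = toy)

# Sales has one value per row of ind, so recycling down the columns
# scales row i of the indicator matrix by Sales[i].
scaled <- ind * toy$Sales
cbind(toy, scaled)
```

Unlike the dcast and reshape() results, this keeps one row per original row and puts 0, not NA, where a product is absent.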