2

我刚刚发现汇总数据表(我对 r 和 data.table 比较陌生)不会产生排序的数据表,也不会产生连接:

起始数据表:

> DailyDataICount
                     ID week WeekDay      Qty
 1:               33739 1746       2     9369
 2:               33739 1746       3     9981
 3:               33739 1746       4    10109
 4:               33739 1746       5     9012
 5:               33739 1746       6    10387
 6:               33739 1746       7        2
 7:               33739 1747       1     7264
 8:               33739 1747       2     9195
 9:               33739 1747       3     8865
10:               33739 1747       4     8613
11:               33739 1747       5     9154
12:               33739 1747       6     9754
13:               33739 1747       7    11565
14:               33739 1748       1     9210
15:               33739 1748       2     8579
16:               33739 1748       3     8406
17:               33739 1748       4     8161
18:               33739 1748       5     8778
19:               33739 1748       6     9884
20:               33739 1748       7    10365

总结:

> MeanCount <- DailyDataICount[,mean(Qty),by="ID,WeekDay"]
> setnames(MeanCount,"V1","ObjQty")
> MeanCount
                    ID WeekDay    ObjQty
1:               33739       2  9047.667
2:               33739       3  9084.000
3:               33739       4  8961.000
4:               33739       5  8981.333
5:               33739       6 10008.333
6:               33739       7  7310.667
7:               33739       1  8237.000

加入:

> Count_SM_FM
                     ID WeekDay ItemType  QtyType
 1:               33739       2       FM      995
 2:               33739       2       SM    15241
 3:               33739       3       FM     1036
 4:               33739       3       SM    15387
 5:               33739       4       FM     1002
 6:               33739       4       SM    15449
 7:               33739       5       FM      990
 8:               33739       5       SM    15388
 9:               33739       6       FM     1344
10:               33739       6       SM    16016
11:               33739       7       SM    13840
12:               33739       1       FM     1015
13:               33739       1       SM    11456
14:               33739       7       FM     1710
> setkey(Count_SM_FM,ID,WeekDay)
> Count_SM_FM <- Count_SM_FM[MeanCount]
> 
> Count_SM_FM
                     ID WeekDay ItemType  QtyType    ObjQty
 1:               33739       2       FM      995  9047.667
 2:               33739       2       SM    15241  9047.667
 3:               33739       3       FM     1036  9084.000
 4:               33739       3       SM    15387  9084.000
 5:               33739       4       FM     1002  8961.000
 6:               33739       4       SM    15449  8961.000
 7:               33739       5       FM      990  8981.333
 8:               33739       5       SM    15388  8981.333
 9:               33739       6       FM     1344 10008.333
10:               33739       6       SM    16016 10008.333
11:               33739       7       SM    13840  7310.667
12:               33739       7       FM     1710  7310.667
13:               33739       1       FM     1015  8237.000
14:               33739       1       SM    11456  8237.000

是因为它更有效率吗?我发现了这一点,因为我必须加入Count_SM_FM不同的 data.table 但使用相同的密钥(我没有再次设置),我得到了一个错误。我将通过重新设置密钥来解决眼前的问题。我知道这是一个哲学问题,不保持顺序可能通常会节省时间,但如果在大多数情况下人们需要使用排序后的结果......

谢谢!

4

1 回答 1

3

表单连接中元素的顺序由x[i]确定i。在您的情况下,iMeanCount并且xCount_SM_FM并且i未排序。

为了得到你想要的结果,你应该这样做:

setkey(Count_SM_FM, ID, WeekDay)
setkey(MeanCount, ID, WeekDay)   ## Just add this line
ans = Count_SM_FM[MeanCount]
#        ID WeekDay ItemType QtyType    ObjQty
#  1: 33739       1       FM    1015  8237.000
#  2: 33739       1       SM   11456  8237.000
#  3: 33739       2       FM     995  9047.667
#  4: 33739       2       SM   15241  9047.667
#  5: 33739       3       FM    1036  9084.000
#  6: 33739       3       SM   15387  9084.000
#  7: 33739       4       FM    1002  8961.000
#  8: 33739       4       SM   15449  8961.000
#  9: 33739       5       FM     990  8981.333
# 10: 33739       5       SM   15388  8981.333
# 11: 33739       6       FM    1344 10008.333
# 12: 33739       6       SM   16016 10008.333
# 13: 33739       7       SM   13840  7310.667
# 14: 33739       7       FM    1710  7310.667
于 2014-06-09T14:35:41.770 回答