I'm trying to create new rows based on the overlapping time periods of existing rows. For example, I'd like to turn this:
library(data.table)

Customer_Product <- data.table(
  Customer   = c("A01", "A01", "A01", "A02", "A02", "A02", "A03", "A03", "A03"),
  Product    = c("Prod1", "Prod2", "Prod3", "Prod1", "Prod2", "Prod3", "Prod1", "Prod2", "Prod3"),
  Start_Date = c("1/1/2015", "3/1/2015", "4/1/2015", "1/1/2015", "3/1/2015", "4/1/2015", "1/1/2015", "3/1/2015", "4/1/2015"),
  End_Date   = c("2/1/2015", "5/1/2015", "5/1/2015", "2/1/2015", "5/1/2015", "6/1/2015", "2/1/2015", "6/1/2015", "5/1/2015")
)
   Customer Product Start_Date End_Date
1:      A01   Prod1   1/1/2015 2/1/2015
2:      A01   Prod2   3/1/2015 5/1/2015
3:      A01   Prod3   4/1/2015 5/1/2015
4:      A02   Prod1   1/1/2015 2/1/2015
5:      A02   Prod2   3/1/2015 5/1/2015
6:      A02   Prod3   4/1/2015 6/1/2015
7:      A03   Prod1   1/1/2015 2/1/2015
8:      A03   Prod2   3/1/2015 6/1/2015
9:      A03   Prod3   4/1/2015 5/1/2015
into this:
Customer_Product_Combo <- data.table(
  Customer               = c("A01", "A01", "A01", "A02", "A02", "A02", "A02", "A03", "A03", "A03", "A03"),
  Product_or_Combination = c("Prod1", "Prod2", "Prod2/Prod3", "Prod1", "Prod2", "Prod2/Prod3", "Prod3", "Prod1", "Prod2", "Prod2/Prod3", "Prod2"),
  Start_Date             = c("1/1/2015", "3/1/2015", "4/1/2015", "1/1/2015", "3/1/2015", "4/1/2015", "5/1/2015", "1/1/2015", "3/1/2015", "4/1/2015", "5/1/2015"),
  End_Date               = c("2/1/2015", "4/1/2015", "5/1/2015", "2/1/2015", "4/1/2015", "5/1/2015", "6/1/2015", "2/1/2015", "4/1/2015", "5/1/2015", "6/1/2015")
)
    Customer Product_or_Combination Start_Date End_Date
 1:      A01                  Prod1   1/1/2015 2/1/2015
 2:      A01                  Prod2   3/1/2015 4/1/2015
 3:      A01            Prod2/Prod3   4/1/2015 5/1/2015
 4:      A02                  Prod1   1/1/2015 2/1/2015
 5:      A02                  Prod2   3/1/2015 4/1/2015
 6:      A02            Prod2/Prod3   4/1/2015 5/1/2015
 7:      A02                  Prod3   5/1/2015 6/1/2015
 8:      A03                  Prod1   1/1/2015 2/1/2015
 9:      A03                  Prod2   3/1/2015 4/1/2015
10:      A03            Prod2/Prod3   4/1/2015 5/1/2015
11:      A03                  Prod2   5/1/2015 6/1/2015
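I've been looking into IRanges, since disjoin() seems like a possible solution, but I can't see any way to carry over or merge the "Prod" data into the disjoint ranges.
I've also been trying to sketch something out with lead/lag from dplyr and by gathering and merging the periods. It's worth noting, though, that I may have cases where more than two "Prod" values overlap, and at that point the logic gets messy.
Is there a reasonable way to do this? Any help is greatly appreciated!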
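To make the intended transformation concrete, here is a minimal sketch of one possible approach in data.table. It is not necessarily the best or only way. It assumes the dates are in month/day/year format, and the names `dt`, `segments`, `combo`, `Seg_Start`, and `Seg_End` are made up for illustration. The idea is to cut each customer's timeline at every start and end date, then label each resulting segment with all products whose period fully covers it, which should also handle more than two overlapping products.

```r
library(data.table)

Customer_Product <- data.table(
  Customer   = c("A01", "A01", "A01", "A02", "A02", "A02", "A03", "A03", "A03"),
  Product    = c("Prod1", "Prod2", "Prod3", "Prod1", "Prod2", "Prod3", "Prod1", "Prod2", "Prod3"),
  Start_Date = c("1/1/2015", "3/1/2015", "4/1/2015", "1/1/2015", "3/1/2015", "4/1/2015", "1/1/2015", "3/1/2015", "4/1/2015"),
  End_Date   = c("2/1/2015", "5/1/2015", "5/1/2015", "2/1/2015", "5/1/2015", "6/1/2015", "2/1/2015", "6/1/2015", "5/1/2015")
)

# Assumption: the date strings are month/day/year.
dt <- copy(Customer_Product)
dt[, `:=`(
  Start = as.Date(Start_Date, format = "%m/%d/%Y"),
  End   = as.Date(End_Date,   format = "%m/%d/%Y")
)]

# Step 1: per customer, collect every distinct start/end date and turn
# consecutive dates into candidate segments.
segments <- dt[, {
  pts <- sort(unique(c(Start, End)))
  list(Seg_Start = head(pts, -1), Seg_End = tail(pts, -1))
}, by = Customer]

# Step 2: pair every segment with every product row of the same customer,
# keep the products whose period fully covers the segment, and collapse
# them into one label. Segments covered by no product drop out here.
combo <- merge(segments, dt, by = "Customer", allow.cartesian = TRUE)[
  Start <= Seg_Start & End >= Seg_End,
  .(Product_or_Combination = paste(sort(Product), collapse = "/")),
  keyby = .(Customer, Seg_Start, Seg_End)
]

print(combo)
```

On the sample data this yields 11 rows whose labels match Customer_Product_Combo above, with `Seg_Start` and `Seg_End` as Date columns rather than strings. The per-customer cartesian merge is simple but may become slow on large inputs, where a non-equi join or `foverlaps()` would probably be a better fit.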