1

This is the data frame

Date        ID cost Value
15/12/2016   1  yes    200
15/10/2016   1  yes    100
15/9/2016    1  yes    55
15/04/2016   1  yes    1000
15/12/2016   2  yes    300
15/10/2016   2  yes    200
15/9/2016    2  yes    100
15/04/2016   2  yes    1000
15/12/2016   3  no     300
15/10/2016   3  no     200
15/9/2016    3  no     100
15/04/2016   3  no     1000

I'd like to repeat the rolling sum on 3 month for each ID that have a cost = "yes". Note that in the example the IDs are just 3 but in my DB are n.

The output should be

Date        ID  Value  Rolling_Sum
15/12/2016   1   200   355
15/10/2016   1   100   155
15/9/2016    1   55    55
15/04/2016   1   1000  1000
15/12/2016   2   300   600
15/10/2016   2   200   300
15/9/2016    2   100   100
15/04/2016   2   1000  1000

I've seen a lot of example in other questions. one of my biggest problem is that the date is not continue.. so i can have different lag between different data.

Thanks

4

1 回答 1

2

您可以为此使用 -package 中的foverlaps函数:data.table

library(data.table)
library(lubridate)

# convert the data to a 'data.table'
setDT(dt)
# convert the Date column to date-class
dt[, Date := as.Date(Date, '%d/%m/%Y')]
# create an exact same column to be used by the 'foverlaps' function
dt[, bdate := Date]
# create a reference 'data.table' with the 3 month intervals
dtc <- copy(dt)[, bdate := Date %m-% months(3)]
# set the keys for the reference data.table (needed for the 'foverlaps' function) 
setkey(dtc, ID, bdate, Date)
# create the overlaps and summarise
foverlaps(dt[cost=='yes'], dtc, type = 'within')[, .(val = sum(i.Value)), by = .(ID, Date)]

这使:

   ID       Date  val
1:  1 2016-12-15  355
2:  1 2016-10-15  155
3:  1 2016-09-15   55
4:  1 2016-04-15 1000
5:  2 2016-12-15  600
6:  2 2016-10-15  300
7:  2 2016-09-15  100
8:  2 2016-04-15 1000
于 2016-10-31T12:14:06.730 回答