I have counted events (Group 1) over a time window for each group (Group 2). I now want to spread the Group 1 events into separate columns, with Group 2 and the timestamp identifying the rows. Each cell should contain the count of that event over a 4-day window ending on the row's date.
In the example below, for each Group 2 value (I and II), I counted how many times events A and L (Group 1) occurred within a 4-day window.
dates = as.Date(c("2011-10-09",
"2011-10-15",
"2011-10-16",
"2011-10-18",
"2011-10-21",
"2011-10-22",
"2011-10-24"))
group1=c("A",
"A",
"A",
"A",
"L",
"L",
"A")
group2=c("I",
"I",
"I",
"I",
"I",
"I",
"II")
df1 <- data.frame(dates, group1, group2)
Using dplyr pipes, I managed to produce the following table (see also "Count event types over time series by multiple conditions"):
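library(dplyr)
# For each (group1, group2) pair, count the events in the 4-day window ending on each date
df1 %>%
  group_by(group1, group2) %>%
  mutate(count = sapply(dates, function(x) {
    sum(dates <= x & dates > (x - 4))
  }))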
dates group1 group2 count
<date> <fctr> <fctr> <int>
1 2011-10-09 A I 1
2 2011-10-15 A I 1
3 2011-10-16 A I 2
4 2011-10-18 A I 3
5 2011-10-21 L I 1
6 2011-10-22 L I 2
7 2011-10-24 A II 1
Eventually, I want a table like the one below, where the counts of events A and L are updated by date (time window = current date - 4 days) for both I and II (Group 2).
dates group1 group2 count (A) count (L)
1 2011-10-09 A I 1 0
2 2011-10-15 A I 1 0
3 2011-10-16 A I 2 0
4 2011-10-18 A I 3 0
5 2011-10-21 L I 0 1
6 2011-10-22 L I 0 2
7 2011-10-24 A II 1 0
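For reference, the table above can be reproduced for the example data with the minimal sketch below. It simply places each row's own rolling count in the column for its event type and puts 0 in the others. The column names `count_A` and `count_L`, the intermediate name `wide`, and the use of `stringsAsFactors = FALSE` are my own assumptions for illustration, not part of the original data.

```r
library(dplyr)

# Same example data as above (stringsAsFactors = FALSE is an assumption made here)
df1 <- data.frame(
  dates  = as.Date(c("2011-10-09", "2011-10-15", "2011-10-16", "2011-10-18",
                     "2011-10-21", "2011-10-22", "2011-10-24")),
  group1 = c("A", "A", "A", "A", "L", "L", "A"),
  group2 = c("I", "I", "I", "I", "I", "I", "II"),
  stringsAsFactors = FALSE
)

# Rolling 4-day count within each (group1, group2) pair, as in the pipe above
wide <- df1 %>%
  group_by(group1, group2) %>%
  mutate(count = sapply(dates, function(x) sum(dates <= x & dates > (x - 4)))) %>%
  ungroup() %>%
  as.data.frame()

# One column per event type: the row's own count where the event matches, 0 elsewhere
for (ev in sort(unique(wide$group1))) {
  wide[[paste0("count_", ev)]] <- ifelse(wide$group1 == ev, wide$count, 0L)
}
wide$count <- NULL

print(wide)
```

Looping over `unique(wide$group1)` avoids hard-coding the event names, so the same sketch should add one column per event type in a larger dataset.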
In a larger dataset, not every Group 1 event appears in every Group 2. How can I fill these empty cells so that each one either 1) carries forward the count from the previous row, or 2) is recomputed from the row's own timestamp and time window?
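To make option 2 concrete, here is a rough base R sketch of what I mean, not a definitive solution. For every row it recounts each event type within the same Group 2 over the 4-day window ending on that row's date. The column names `count_A` and `count_L` and the helper name `events` are assumptions for illustration.

```r
# Same example data as above (stringsAsFactors = FALSE is an assumption made here)
df1 <- data.frame(
  dates  = as.Date(c("2011-10-09", "2011-10-15", "2011-10-16", "2011-10-18",
                     "2011-10-21", "2011-10-22", "2011-10-24")),
  group1 = c("A", "A", "A", "A", "L", "L", "A"),
  group2 = c("I", "I", "I", "I", "I", "I", "II"),
  stringsAsFactors = FALSE
)

events <- sort(unique(df1$group1))

# For each event type, count matching events in the same group2
# that fall in the window (date - 4, date] of every row
for (ev in events) {
  df1[[paste0("count_", ev)]] <- sapply(seq_len(nrow(df1)), function(i) {
    sum(df1$group2 == df1$group2[i] &
        df1$group1 == ev &
        df1$dates <= df1$dates[i] &
        df1$dates > df1$dates[i] - 4)
  })
}

print(df1)
```

Under this reading the result is not identical to the table above: the 2011-10-21 row gets `count_A = 1`, because the A event on 2011-10-18 is still inside its window, while the 2011-10-22 row gets `count_A = 0`. The row-by-row `sapply` is quadratic in the number of rows, so it is only meant to pin down the logic.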
Thanks!