I have updated the question because a) I did not articulate it clearly on the first attempt, and b) my exact need has also shifted somewhat.
I especially want to thank Hemmo for his great help so far, and I apologise for not articulating my question clearly enough to him. His code (which addressed the earlier version of the problem) is shown in the answer section.
At a high level, I am looking for code that identifies and differentiates the blocks of consecutive free time of different individuals. More specifically, the code would ideally:
- Check whether an activity is labelled as "Free".
- Check whether the adjacent weeks (one week earlier, one week later) of the same person were also labelled as "Free".
- Give the entire block of consecutive weeks of that person that are labelled "Free" an indicator in the desired outcome column. Note that the length of these blocks (e.g. 1 consecutive week, 4 consecutive weeks, 8 consecutive weeks) will vary.
- Finally, because I need to analyse the characteristics of these blocks further, different blocks should receive different indicators (e.g. the March block of Paul would have value 1, his May block value 2, and Kim's block in March would have value 3).
Hopefully this becomes clearer when looking at the example dataframe (see the desired final column).
Any help is much appreciated. The code for the test dataframe is given below.
Many thanks in advance,
W
Example (note that the last column should be generated by the code; it is included here purely as illustration):
         Week Name  Activity Hours Desired_Outcome
1  01/01/2013 Paul      Free    40               1
2  08/01/2013 Paul      Free    10               1
3  08/01/2013 Paul Project A    30               0
4  15/01/2013 Paul Project B    30               0
5  15/01/2013 Paul Project A    10               0
6  22/01/2013 Paul      Free    40               2
7  29/01/2013 Paul Project B    40               0
8  05/02/2013 Paul      Free    40               3
9  12/02/2013 Paul      Free    10               3
10 19/02/2013 Paul      Free    30               3
11 01/01/2013  Kim Project E    40               0
12 08/01/2013  Kim      Free    40               4
13 15/01/2013  Kim      Free    40               4
14 22/01/2013  Kim Project E    40               0
15 29/01/2013  Kim      Free    40               5
Code for dataframe:
Name=c(rep("Paul",10),rep("Kim",5))
Week=c("01/01/2013","08/01/2013","08/01/2013","15/01/2013","15/01/2013","22/01/2013","29/01/2013","05/02/2013","12/02/2013","19/02/2013","01/01/2013","08/01/2013","15/01/2013","22/01/2013","29/01/2013")
Activity=c("Free","Free","Project A","Project B","Project A","Free","Project B","Free","Free","Free","Project E","Free","Free","Project E","Free")
Hours=c(40,10,30,30,10,40,40,40,10,30,40,40,40,40,40)
Desired_Outcome=c(1,1,0,0,0,2,0,3,3,3,0,4,4,0,5)
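# data.frame() keeps Hours and Desired_Outcome numeric; cbind() would coerce every column to character
df=data.frame(Week,Name,Activity,Hours,Desired_Outcome,stringsAsFactors=FALSE)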
df
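To make the bullet points above more concrete, here is a rough base R sketch of the logic I have in mind. It is not a definitive solution and rests on a few assumptions: weeks are written as dd/mm/yyyy, consecutive weeks are exactly 7 days apart, and a week counts as "Free" for a person as soon as it has at least one "Free" row (even if other activities fall in the same week). The function name label_free_blocks and the column name Block are made up for this illustration.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(
  Week = c("01/01/2013", "08/01/2013", "08/01/2013", "15/01/2013", "15/01/2013",
           "22/01/2013", "29/01/2013", "05/02/2013", "12/02/2013", "19/02/2013",
           "01/01/2013", "08/01/2013", "15/01/2013", "22/01/2013", "29/01/2013"),
  Name = c(rep("Paul", 10), rep("Kim", 5)),
  Activity = c("Free", "Free", "Project A", "Project B", "Project A", "Free",
               "Project B", "Free", "Free", "Free", "Project E", "Free", "Free",
               "Project E", "Free"),
  Hours = c(40, 10, 30, 30, 10, 40, 40, 40, 10, 30, 40, 40, 40, 40, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): returns one block id per row,
# 0 for rows that are not "Free".
label_free_blocks <- function(df) {
  week    <- as.Date(df$Week, format = "%d/%m/%Y")  # assumes dd/mm/yyyy
  is_free <- df$Activity == "Free"
  block   <- integer(nrow(df))                      # non-free rows stay 0
  counter <- 0L                                     # blocks assigned so far

  for (person in unique(df$Name)) {
    # Free rows of this person, ordered by week
    idx <- which(is_free & df$Name == person)
    if (length(idx) == 0) next
    idx <- idx[order(week[idx])]

    # A new block starts whenever the previous free week of the same person
    # is more than 7 days back (the first free week always starts a block).
    gap    <- c(Inf, diff(as.numeric(week[idx])))
    starts <- gap > 7

    block[idx] <- counter + cumsum(starts)
    counter    <- counter + sum(starts)
  }
  block
}

df$Block <- label_free_blocks(df)
df
```

On the example data, the Block column should come out the same as the Desired_Outcome column shown above. Block ids are handed out in the order in which names first appear in the data, so Paul's blocks are numbered before Kim's.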