I am looking at a data set of Emergency Room visits. For each ID, I only want to keep visits that are at least 30 days after the previously kept visit for that ID. As an example, say I have the data below.
If I start with ID=1:
- Row 1 is the first visit, so I keep it and use it as the anchor. The gap between Row 1 and Row 2 is 14 days, so I will exclude (or for now flag) Row 2.
- Then I continue to use Row 1 to evaluate Row 3. The gap is only 16 days, so I exclude Row 3 and look at Row 4.
- Row 4 is 34 days after Row 1, so I keep it and then use Row 4 as the new anchor to evaluate Row 5, and so on.
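
To make the rule concrete, here is a minimal Python sketch of the logic I am after. It is only an illustration of the anchor idea, not the SQL I need. The function name `keep_visits`, the tuple layout, and the choice of `>=` for the 30-day comparison are my own assumptions.

```python
from datetime import date

def keep_visits(visits, min_gap_days=30):
    """Return the row numbers to keep.

    visits: iterable of (row, id, visit_date) tuples (hypothetical layout).
    A visit is kept when it is the first for its ID, or when it falls at
    least min_gap_days after the most recently kept visit for that ID.
    """
    kept = []
    anchor = {}  # id -> date of the most recently kept visit
    # Process each ID's visits in date order.
    for row, pid, visit_date in sorted(visits, key=lambda v: (v[1], v[2])):
        last = anchor.get(pid)
        # Assumption: "30 days apart" means a gap of 30 days or more.
        if last is None or (visit_date - last).days >= min_gap_days:
            kept.append(row)
            anchor[pid] = visit_date  # this visit becomes the new anchor
    return kept

visits = [
    (1, 1, date(2020, 1, 1)),
    (2, 1, date(2020, 1, 15)),
    (3, 1, date(2020, 1, 17)),
    (4, 1, date(2020, 2, 4)),
    (5, 1, date(2020, 3, 15)),
    (6, 2, date(2020, 1, 15)),
    (7, 2, date(2020, 3, 15)),
    (8, 2, date(2020, 3, 18)),
]

kept_rows = keep_visits(visits)
print(kept_rows)  # [1, 4, 5, 6, 7]
```

The point is that each row is compared against the last kept row, not the immediately preceding row, which is why a plain lag does not work for me.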
I have been trying to do this with the LAG function, but I can't figure out how to use it when I have to keep using the same 'anchor' row to evaluate several rows.
HAVE is what I have and WANT is what I want. Any ideas?
I am using Azure Data Studio.
HAVE
Row# ID DATE
1 1 1/1/2020
2 1 1/15/2020
3 1 1/17/2020
4 1 2/4/2020
5 1 3/15/2020
6 2 1/15/2020
7 2 3/15/2020
8 2 3/18/2020
WANT
Row# ID DATE
1 1 1/1/2020
4 1 2/4/2020
5 1 3/15/2020
6 2 1/15/2020
7 2 3/15/2020