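I want to identify the IDs with a newly developed disease in a dataset. The dataset takes the form of a diary in which people answer a yes/no question every day about whether they have the disease.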
ID <- c(1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3)
Date <- c("2020-03-10","2020-03-11","2020-03-12","2020-03-13","2020-03-14","2020-03-12","2020-03-13","2020-03-14","2020-03-15","2020-03-16","2020-03-17","2020-03-18", "2020-03-12","2020-03-13","2020-03-14","2020-03-15","2020-03-16","2020-03-17","2020-03-18","2020-03-19","2020-03-20")
Disease <- c("No","No","Yes","Yes","Yes","No","No","No", "Yes","Yes","Yes","No","Yes","Yes","No","No","No","Yes","Yes","Yes","Yes")
df <- data.frame(ID, Date, Disease)
df
ID Date Disease
1 2020-03-10 No
1 2020-03-11 No
1 2020-03-12 Yes
1 2020-03-13 Yes
1 2020-03-14 Yes
2 2020-03-12 No
2 2020-03-13 No
2 2020-03-14 No
2 2020-03-15 Yes
2 2020-03-16 Yes
2 2020-03-17 Yes
2 2020-03-18 No
3 2020-03-12 Yes
3 2020-03-13 Yes
3 2020-03-14 No
3 2020-03-15 No
3 2020-03-16 No
3 2020-03-17 Yes
3 2020-03-18 Yes
3 2020-03-19 Yes
3 2020-03-20 Yes
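However, to qualify as "newly developed disease", a person must meet the following conditions:
1. The person must have answered "Yes" on at least 2 consecutive days.
2. The person must have answered "No" on at least 3 consecutive days directly before the first of those "Yes" days.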
As output, I would like the number of people who meet these conditions. In the extract of the dataset above, that would be two (IDs 2 and 3).
Does anyone know how to achieve this? Thanks in advance for your time!
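For reference, here is a minimal sketch of one way the two conditions could be checked in base R with run-length encoding (rle). It is only an illustration under some assumptions: the helper name has_new_onset and its parameters min_no and min_yes are made up for this example, each person is assumed to have exactly one row per day with no missing days (calendar gaps are not checked), and the "No" streak is assumed to sit directly before the "Yes" streak, which is the reading that gives IDs 2 and 3 for the sample data.

```r
# Sample data from the question
ID <- c(1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3)
Date <- c("2020-03-10","2020-03-11","2020-03-12","2020-03-13","2020-03-14",
          "2020-03-12","2020-03-13","2020-03-14","2020-03-15","2020-03-16",
          "2020-03-17","2020-03-18",
          "2020-03-12","2020-03-13","2020-03-14","2020-03-15","2020-03-16",
          "2020-03-17","2020-03-18","2020-03-19","2020-03-20")
Disease <- c("No","No","Yes","Yes","Yes",
             "No","No","No","Yes","Yes","Yes","No",
             "Yes","Yes","No","No","No","Yes","Yes","Yes","Yes")
df <- data.frame(ID, Date, Disease)

# Hypothetical helper (not from any package): returns TRUE if a run of at
# least min_no "No" answers is directly followed by a run of at least
# min_yes "Yes" answers.
has_new_onset <- function(answers, min_no = 3, min_yes = 2) {
  runs <- rle(as.character(answers))   # collapse answers into consecutive runs
  n <- length(runs$values)
  if (n < 2) return(FALSE)             # a single run can never qualify
  i <- seq_len(n - 1)                  # index of each run that has a successor
  any(runs$values[i] == "No" & runs$lengths[i] >= min_no &
      runs$values[i + 1] == "Yes" & runs$lengths[i + 1] >= min_yes)
}

# Sort by ID and date so that the runs follow calendar order
df <- df[order(df$ID, as.Date(df$Date)), ]

# One logical flag per ID, then the qualifying IDs and their count
flags <- tapply(df$Disease, df$ID, has_new_onset)
new_onset_ids <- names(flags)[flags]
n_new_onset <- sum(flags)

new_onset_ids   # "2" "3"
n_new_onset     # 2
```

ID 1 is excluded because only two "No" days precede its "Yes" streak. If the diary can have missing days, the date gaps would need to be checked separately before the runs are counted.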