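Suppose I have a dataset of units that can change status from active to inactive over time. I want to record the switch from active to inactive every time a unit's status changes. A reproducible example: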
UNIT <- c(100,100, 200, 200, 200, 200, 200, 300, 300, 300,300)
STATUS <- c('ACTIVE','INACTIVE','ACTIVE','ACTIVE','INACTIVE','ACTIVE','INACTIVE','ACTIVE','ACTIVE',
'ACTIVE','INACTIVE')
TERMINATED <- c('1999-07-06' , '2008-12-05' , '2000-08-18' , '2000-08-18' ,'2000-08-18' ,'2008-08-18',
'2008-08-18','2006-09-19','2006-09-19' ,'2006-09-19' ,'1999-03-15')
START <- c('2007-04-23','2008-12-06','2004-06-01','2007-02-01','2008-04-19','2010-11-29','2010-12-30',
'2007-10-29','2008-02-05','2008-06-30','2009-02-07')
STOP <- c('2008-12-05','4712-12-31','2007-01-31','2008-04-18','2010-11-28','2010-12-29','4712-12-31',
'2008-02-04','2008-06-29','2009-02-06','4712-12-31')
DAT <- data.frame(UNIT,STATUS,TERMINATED,START,STOP)
DAT
   UNIT   STATUS TERMINATED      START       STOP
1   100   ACTIVE 1999-07-06 2007-04-23 2008-12-05
2   100 INACTIVE 2008-12-05 2008-12-06 4712-12-31
3   200   ACTIVE 2000-08-18 2004-06-01 2007-01-31
4   200   ACTIVE 2000-08-18 2007-02-01 2008-04-18
5   200 INACTIVE 2000-08-18 2008-04-19 2010-11-28
6   200   ACTIVE 2008-08-18 2010-11-29 2010-12-29
7   200 INACTIVE 2008-08-18 2010-12-30 4712-12-31
8   300   ACTIVE 2006-09-19 2007-10-29 2008-02-04
9   300   ACTIVE 2006-09-19 2008-02-05 2008-06-29
10  300   ACTIVE 2006-09-19 2008-06-30 2009-02-06
11  300 INACTIVE 1999-03-15 2009-02-07 4712-12-31
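When a unit's status changes from ACTIVE to INACTIVE, it means the unit has been terminated. Unfortunately, the recorded termination date (TERMINATED) is not reliable. The valid termination date is the START date of the record where the unit switches to inactive (STATUS == INACTIVE) minus 1 day; in other words, the STOP date of the preceding active record. For example, for unit 100 the TERMINATED date in row 2 is correct. For unit 300, however, the termination date should be "2009-02-06". The solution should be robust enough to recognize that unit 200 has two inactive periods and fill in the dates accordingly.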
I don't even know where to start with something like this in R.
The final result should look like this:
   UNIT   STATUS TERMINATED      START       STOP
1   100   ACTIVE 2008-12-05 2007-04-23 2008-12-05
2   100 INACTIVE 2008-12-05 2008-12-06 4712-12-31
3   200   ACTIVE 2008-04-18 2004-06-01 2007-01-31
4   200   ACTIVE 2008-04-18 2007-02-01 2008-04-18
5   200 INACTIVE 2008-04-18 2008-04-19 2010-11-28
6   200   ACTIVE 2010-12-29 2010-11-29 2010-12-29
7   200 INACTIVE 2010-12-29 2010-12-30 4712-12-31
8   300   ACTIVE 2009-02-06 2007-10-29 2008-02-04
9   300   ACTIVE 2009-02-06 2008-02-05 2008-06-29
10  300   ACTIVE 2009-02-06 2008-06-30 2009-02-06
11  300 INACTIVE 2009-02-06 2009-02-07 4712-12-31
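
For reference, here is one way the rule described above might be sketched in base R. It is only a sketch under some assumptions: the rows are already sorted by UNIT and START, every "spell" of records ends at an INACTIVE row, and a unit that is still active (no INACTIVE row yet) keeps its recorded TERMINATED value. The helper name `fix_terminated` and the intermediate names `spell` and `key` are made up for illustration.

```r
# Rebuild the example data (strings kept as character, not factor).
DAT <- data.frame(
  UNIT   = c(100, 100, 200, 200, 200, 200, 200, 300, 300, 300, 300),
  STATUS = c("ACTIVE", "INACTIVE", "ACTIVE", "ACTIVE", "INACTIVE", "ACTIVE",
             "INACTIVE", "ACTIVE", "ACTIVE", "ACTIVE", "INACTIVE"),
  TERMINATED = c("1999-07-06", "2008-12-05", "2000-08-18", "2000-08-18",
                 "2000-08-18", "2008-08-18", "2008-08-18", "2006-09-19",
                 "2006-09-19", "2006-09-19", "1999-03-15"),
  START = c("2007-04-23", "2008-12-06", "2004-06-01", "2007-02-01",
            "2008-04-19", "2010-11-29", "2010-12-30", "2007-10-29",
            "2008-02-05", "2008-06-30", "2009-02-07"),
  STOP  = c("2008-12-05", "4712-12-31", "2007-01-31", "2008-04-18",
            "2010-11-28", "2010-12-29", "4712-12-31", "2008-02-04",
            "2008-06-29", "2009-02-06", "4712-12-31"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assumes dat is sorted by UNIT, then START.
fix_terminated <- function(dat) {
  inactive <- as.character(dat$STATUS) == "INACTIVE"

  # Spell id within each unit: it increases by one AFTER each INACTIVE row,
  # so every spell is a run of ACTIVE rows closed by a single INACTIVE row.
  spell <- ave(as.integer(inactive), dat$UNIT,
               FUN = function(x) cumsum(c(0L, head(x, -1))))
  key <- paste(dat$UNIT, spell)

  # Termination date per spell: START of its INACTIVE row minus one day.
  term_by_key <- setNames(
    format(as.Date(as.character(dat$START[inactive])) - 1),
    key[inactive]
  )

  # Look up each row's spell; spells with no INACTIVE row come back as NA
  # and keep the recorded value (an assumption, not part of the question).
  new_term <- unname(term_by_key[key])
  dat$TERMINATED <- ifelse(is.na(new_term),
                           as.character(dat$TERMINATED), new_term)
  dat
}

RESULT <- fix_terminated(DAT)
RESULT
```

The grouping step is what handles unit 200: its two INACTIVE rows produce two separate spells, so rows 3 to 5 and rows 6 to 7 each get their own date.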