I have the following data frame:

id<-c(1,1,1,1,1,3,3,3,3)
period<-c("calib","calib","calib","valid","valid","calib","calib","calib","valid")
date<-c("11-11-07","11-11-07","23-11-07","12-12-08","17-12-08","11-11-07","23-11-07","23-11-07","16-01-08")
time<-c(12,13,14,11,23,15,12,18,14)
df<-data.frame(id,period,time,date)
df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")


id period time     date      date2
 1  calib   12 11-11-07 2007-11-11
 1  calib   13 11-11-07 2007-11-11
 1  calib   14 23-11-07 2007-11-23
 1  valid   11 12-12-08 2008-12-12
 1  valid   23 17-12-08 2008-12-17
 3  calib   15 11-11-07 2007-11-11
 3  calib   12 23-11-07 2007-11-23
 3  calib   18 23-11-07 2007-11-23
 3  valid   14 16-01-08 2008-01-16

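For each id, I need to extract the date of the last transaction made in the calib period and put it in a new column. If two transactions were made on the same day (as happens in the example data), the last one should be chosen according to the transaction time. The final table I am looking for is as follows: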

id period time     date      date2  last
 1  calib   12 11-11-07 2007-11-11   NA
 1  calib   13 11-11-07 2007-11-11   NA
 1  calib   14 23-11-07 2007-11-23 2007-11-23
 1  valid   11 12-12-08 2008-12-12   NA
 1  valid   23 17-12-08 2008-12-17   NA 
 3  calib   15 11-11-07 2007-11-11   NA
 3  calib   12 23-11-07 2007-11-23   NA
 3  calib   18 23-11-07 2007-11-23 2007-11-23
 3  valid   14 16-01-08 2008-01-16   NA

Can anyone help me with this?

1 Answer

This is how I would solve the problem using rle:

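# Process each id separately.
L1 <- lapply(split(df, df[, "id"]), function(dat){
    dat[, "last"] <- as.Date(NA)
    # rle() finds the runs of consecutive equal period values;
    # cumsum() of the run lengths gives the row index where each run ends.
    x <- rle(as.character(dat[, "period"]))
    z <- cumsum(x[["lengths"]])
    # Copy date2 into last on the final row of each "calib" run.
    # This relies on rows being sorted by date and time within each id.
    dat$last[z[x[["values"]] == "calib"]] <- dat[z[x[["values"]] == "calib"] , 
        "date2"]
    dat
})

# Recombine the per-id pieces into one data frame.
data.frame(do.call(rbind, L1), row.names = NULL)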
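
The rle approach picks the last row of each calib run, so it gives the requested result only when the rows are already sorted by date and time within each id, as they are in the example. If that ordering cannot be assumed, a variation along the following lines may help. This is only a sketch: mark_last_calib is a hypothetical helper name, and it treats all calib rows of an id as one group instead of looking at consecutive runs.

```r
# Rebuild the example data from the question.
df <- data.frame(
  id     = c(1, 1, 1, 1, 1, 3, 3, 3, 3),
  period = c("calib", "calib", "calib", "valid", "valid",
             "calib", "calib", "calib", "valid"),
  time   = c(12, 13, 14, 11, 23, 15, 12, 18, 14),
  date   = c("11-11-07", "11-11-07", "23-11-07", "12-12-08", "17-12-08",
             "11-11-07", "23-11-07", "23-11-07", "16-01-08"),
  stringsAsFactors = FALSE
)
df$date2 <- as.Date(df$date, format = "%d-%m-%y")

# mark_last_calib() is a hypothetical helper, not part of the answer above.
mark_last_calib <- function(df) {
  df$last <- as.Date(NA)
  # Row positions of all calib transactions.
  calib <- which(df$period == "calib")
  # Order those rows by id, then date, then time, so the last row of
  # each id block is that id's latest calib transaction.
  ord <- calib[order(df$id[calib], df$date2[calib], df$time[calib])]
  # Keep only the final row of each id block.
  pick <- ord[!duplicated(df$id[ord], fromLast = TRUE)]
  df$last[pick] <- df$date2[pick]
  df
}

res <- mark_last_calib(df)
print(res)
```

Because the rows are selected by an explicit order() on date2 and time, the input row order does not matter, and the original row order is preserved in the result.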
answered 2012-08-27T03:04:04.620