3

I have some data that looks like this:

id    time
1     2013-02-04 02:20:59
1     2013-02-04 02:21:05
1     2013-02-04 02:21:24
2     2013-02-04 02:21:26
2     2013-02-04 02:22:19
2     2013-02-04 02:22:35

I want the time difference between consecutive time values within each id, for example:

id 1: 02:21:05 - 02:20:59 = 00:00:06

How can I do this in R?

4

3 Answers

3

You should take diff of time by id, then use ifelse to fill in a third column:

df <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L), 
        time = structure(c(1359915659, 1359915665, 1359915684, 
        1359915686, 1359915739, 1359915755), class = c("POSIXct", 
        "POSIXt"), tzone = "")), .Names = c("id", "time"), row.names = c(NA, -6L), 
        class = "data.frame")
df
##   id                time
## 1  1 2013-02-04 02:20:59
## 2  1 2013-02-04 02:21:05
## 3  1 2013-02-04 02:21:24
## 4  2 2013-02-04 02:21:26
## 5  2 2013-02-04 02:22:19
## 6  2 2013-02-04 02:22:35

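## keep the time difference only where consecutive rows share an id (diff in id is 0);
## rows are assumed sorted by id and time, and as.numeric(..., units = "secs")
## pins the unit so the result is always in seconds
df$result <- c(0, ifelse(diff(df$id) == 0, as.numeric(diff(df$time), units = "secs"), 0))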

df
##   id                time result
## 1  1 2013-02-04 02:20:59      0
## 2  1 2013-02-04 02:21:05      6
## 3  1 2013-02-04 02:21:24     19
## 4  2 2013-02-04 02:21:26      0
## 5  2 2013-02-04 02:22:19     53
## 6  2 2013-02-04 02:22:35     16
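
The question shows the difference as 00:00:06 rather than as a bare number of seconds. A minimal sketch of that formatting step, assuming the differences are non-negative whole seconds (the names secs and hms are made up for illustration):

```r
# Differences in whole seconds, e.g. the values of the result column
secs <- as.integer(c(0, 6, 19, 0, 53, 16))

# Split each value into hours, minutes and seconds, then zero-pad each part
hms <- sprintf("%02d:%02d:%02d",
               secs %/% 3600L, (secs %% 3600L) %/% 60L, secs %% 60L)
hms
```

The second element corresponds to the 6-second gap from the question.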
answered 2013-03-12T05:24:26.087
1

Here is a grouped solution using base packages, with by and transform:

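   ## dat is the data frame from the question; rows are assumed sorted by id
   ## as.numeric(time) pins the differences to seconds
   transform(dat, res = unlist(by(as.numeric(time), as.factor(id),
                               FUN = function(x) c(0, diff(x)))))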

This works with a factor id, which is the natural type for a grouping column.
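
As a hedged alternative sketch (not part of the original answer), base R's ave() does the same grouped computation and writes each group's results back into the original row positions, so the rows do not need to be sorted by id. The times within each id are still assumed to be in chronological order, and the sample data below is rebuilt from the question with tz = "UTC" chosen only for reproducibility:

```r
# Sample data mirroring the question
dat <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L, 2L),
  time = as.POSIXct(c("2013-02-04 02:20:59", "2013-02-04 02:21:05",
                      "2013-02-04 02:21:24", "2013-02-04 02:21:26",
                      "2013-02-04 02:22:19", "2013-02-04 02:22:35"),
                    tz = "UTC")
)

# Within each id: 0 for the first row, then the gap in seconds to the previous row
dat$res <- ave(as.numeric(dat$time), dat$id,
               FUN = function(x) c(0, diff(x)))
dat$res
```

Converting time with as.numeric() first keeps every difference in seconds, whatever unit diff() would otherwise pick for a POSIXct vector.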

answered 2013-03-12T06:01:40.810
0

I think this will work, but I'm not entirely clear on what you're asking for...

df <- data.frame(id = rep(c(1,2), each=3), time=seq(from = as.POSIXct("2013-02-04 02:20:59"), to=as.POSIXct("2013-02-04 02:22:35"),length.out=6))

library(plyr)
df.diff <- ddply(df, .(id), summarise,
                 difference = diff(as.numeric(time)))
df.diff
#   id difference
# 1  1       19.2
# 2  1       19.2
# 3  2       19.2
# 4  2       19.2
answered 2013-03-12T05:24:36.117