
I have a string of categories like this:

categoryVector <- c("1_100_1_2_3")

I also have the timestamp at which each category was entered:

timeVector <- c("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713")  

I want to calculate the time spent in category 1 and in category 2:

Time spent in category 1: (time at 100 - time at first 1) + (time at 2 - time at second 1)
Time spent in category 2: time at 3 - time at 2
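
To make the arithmetic concrete, here is a minimal sketch for the sample record. It assumes the digits after the comma are milliseconds and that the timestamps can be treated as UTC; `parse_ts` is a hypothetical helper introduced only for this illustration.

```r
# Assumption: "05:16:50,617" means 50.617 seconds (comma = decimal separator).
parse_ts <- function(x) {
  as.POSIXct(sub(",", ".", x, fixed = TRUE),
             format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
}

t_1a  <- parse_ts("2013-03-07 05:16:50,617")  # first entry into category 1
t_100 <- parse_ts("2013-03-07 05:19:24,984")  # entry into category 100
t_1b  <- parse_ts("2013-03-07 05:21:06,002")  # second entry into category 1
t_2   <- parse_ts("2013-03-07 05:21:06,833")  # entry into category 2
t_3   <- parse_ts("2013-03-07 05:21:10,713")  # entry into category 3

# Category 1 is visited twice, so its two intervals are added together.
cat1 <- as.numeric(difftime(t_100, t_1a, units = "secs")) +
        as.numeric(difftime(t_2,   t_1b, units = "secs"))
cat2 <- as.numeric(difftime(t_3, t_2, units = "secs"))

round(c(cat1 = cat1, cat2 = cat2), 3)
# cat1 is about 155.198 seconds, cat2 about 3.88 seconds
```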

I need to repeat these calculations for 200K+ records. Is there an efficient way to do this in R?


1 Answer

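# Split the record on "_" into one row per entry. Note that sep="," puts the
# milliseconds into a separate column V2, so V1 holds whole seconds only.
inp <- read.table(text=gsub("_", "\n", timeVector), sep=",")
inp$V1 <- as.POSIXct(inp$V1)
inp2 <- read.table(text=gsub("_", "\n", categoryVector))

# Each diff is the gap to the next timestamp, i.e. the time spent in that
# row's category; the last row has no successor, hence NA.
inp$diffs <- c( difftime(inp$V1[-1], inp$V1[-nrow(inp)]), NA)
inp <- cbind(inp,inp2)
inp
#                    V1  V2 diffs  V1
# 1 2013-03-07 05:16:50 617   154   1
# 2 2013-03-07 05:19:24 984   102 100
# 3 2013-03-07 05:21:06   2     0   1
# 4 2013-03-07 05:21:06 833     4   2
# 5 2013-03-07 05:21:10 713    NA   3
# should probably rename those columns
# Sum the seconds per category (column 4 holds the category codes).
tapply(inp$diffs, inp[,4], sum, na.rm=TRUE)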
#  1   2   3 100 
#154   4   0 102 
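
The same idea can be wrapped in a function and applied to many records. The following is only a sketch under stated assumptions: `time_per_category` is a hypothetical helper, entries are separated by "_", the digits after the comma are milliseconds (kept here rather than split off), timestamps are treated as UTC, and every record has at least two entries.

```r
# Hypothetical helper: seconds spent per category for ONE record.
time_per_category <- function(categories, times) {
  cats   <- strsplit(categories, "_", fixed = TRUE)[[1]]
  stamps <- strsplit(times, "_", fixed = TRUE)[[1]]
  stopifnot(length(cats) == length(stamps))

  # Turn "05:16:50,617" into "05:16:50.617" so %OS reads the fraction.
  ts <- as.POSIXct(sub(",", ".", stamps, fixed = TRUE),
                   format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

  # Time in an entry = gap to the next timestamp; the last entry has no
  # successor, so it is dropped from the category vector.
  secs <- diff(as.numeric(ts))
  tapply(secs, cats[-length(cats)], sum)
}

categoryVector <- c("1_100_1_2_3")
timeVector <- c("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713")

# Map() walks both vectors in parallel and returns one result per record.
res <- Map(time_per_category, categoryVector, timeVector)
round(res[[1]], 3)
```

For the sample record this gives roughly 155.198 seconds for category 1, 101.018 for category 100 and 3.88 for category 2. Because the last entry is dropped, category 3 does not appear at all, whereas the `tapply` call above reports it as 0. With 200K+ records, `Map` still loops in R; whether that is fast enough would have to be measured on the real data.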
answered 2013-04-18T18:32:32.920