I have a dataset like this:

> dput(data)
structure(list(Run = c("Dur 2", "Dur 3", "Dur 4", "Dur 5", "Dur 7", 
"Dur 8", "Dur 9"), reference = c("00h 00m 32s", "00h 00m 31s", 
"00h 05m 46s", "00h 03m 51s", "00h 06m 49s", "00h 06m 47s", "00h 08m 56s"
), test30 = c("00h 00m 44s", "00h 00m 41s", "00h 21m 54s", "00h 13m 37s", 
"00h 28m 48s", "00h 22m 54s", "10h 02m 12s"), test31 = c("00h 00m 39s", 
"00h 00m 45s", "00h 40m 10s", "00h 23m 07s", "00h 35m 23s", "00h 47m 42s", 
"25h 37m 05s"), test32 = c("00h 01m 05s", "00h 01m 13s", "00h 55m 02s", 
"00h 28m 54s", "01h 03m 17s", "01h 02m 08s", "39h 04m 39s")), .Names = c("Run", 
"reference", "test30", "test31", "test32"), class = "data.frame", row.names = c(NA, 
-7L))

I am trying to get it into a plottable format, like so:

library(reshape2)
library(scales)

# melt the data and convert the time strings to POSIXct format
data_melted <- melt(data, id.var = "Run")
data_melted$value <- as.POSIXct(data_melted$value, format = "%Hh %Mm %Ss")

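Presumably because POSIXct expects actual HMS data in the 24-hour clock sense, I get NAs for the last run, Dur 9, where the hour count exceeds 24.

What is the recommended way to handle recorded durations like this, where H > 24 is not rolled over into days?

Do I need to check for such instances manually and build a new string that includes a date (which seems to require picking an arbitrary start date and incrementing the day if H > 24)? Or is there a package better suited to pure duration data, rather than one that assumes all time data is recorded as actual timestamps?

Many thanks!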
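A minimal sketch of where the NAs come from, using three of the strings from the data above: the `%H` conversion only accepts hours from 00 to 23, so anything larger fails to parse. The `tz = "UTC"` argument is an assumption added here only to keep the example independent of the local time zone.

```r
# Hour fields in the range 00-23 parse; larger ones come back as NA
x <- c("10h 02m 12s", "25h 37m 05s", "39h 04m 39s")
parsed <- as.POSIXct(x, format = "%Hh %Mm %Ss", tz = "UTC")

is.na(parsed)
# FALSE TRUE TRUE
```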

1 Answer

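You can use colsplit from the reshape2 package to create columns for hours, minutes and seconds, and then create a difftime object, which can be added to a date.

library(reshape2)

# melt the data, keeping the duration strings as character
mdd <- melt(data, id.var = "Run")

# gsub('s', '', mdd[['value']]) removes the trailing s from each value;
# we then split on `[hm]` (i.e. h or m), which returns a data.frame with
# 3 integer columns
times <- colsplit(gsub('s', '', mdd[['value']]), '[hm]', names = c('h', 'm', 's'))

# total duration in seconds as a difftime object
seconds <- as.difftime(with(times, h*60*60 + m*60 + s), units = 'secs')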
seconds
Time differences in secs
 [1]     32     31    346    231    409    407    536     44     41   1314    817   1728   1374  36132     39     45
[17]   2410   1387   2123   2862  92225     65     73   3302   1734   3797   3728 140679
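If you would rather not depend on a package for the split, the same conversion can be sketched in base R. The helper name `hms_to_secs` is hypothetical and not part of the approach above; it assumes every string contains exactly three numeric fields in hour, minute, second order.

```r
# Hypothetical helper (an illustration, not from the answer above):
# convert "HHh MMm SSs" strings to seconds using base R only.
hms_to_secs <- function(x) {
  # pull out every run of digits, e.g. "25h 37m 05s" -> "25" "37" "05"
  parts <- regmatches(x, gregexpr("[0-9]+", x))
  # weight hours, minutes and seconds, then sum per string
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

secs <- hms_to_secs(c("00h 00m 32s", "10h 02m 12s", "25h 37m 05s"))
secs
# values: 32, 36132, 92225

# a plain numeric number of hours is often convenient for a plot axis
hours <- secs / 3600
```

The three values agree with the corresponding entries in the output above (32, 36132 and 92225 seconds).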

You don't need to do the arithmetic yourself; use Map and Reduce:

 Reduce('+',Map(as.difftime, times, units = c('hours','mins','secs')))
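As a small worked sketch of that one-liner, here it is on a two-row stand-in for `times`, followed by the "add it to a date" step. The start date `2013-01-01` and the `tz = "UTC"` setting are arbitrary assumptions chosen for illustration.

```r
# Two-row stand-in for the `times` data.frame: 00h 00m 32s and 25h 37m 05s
times <- data.frame(h = c(0, 25), m = c(0, 37), s = c(32, 5))

# One difftime per column, each in its own unit; adding difftimes with
# different units lets R handle the unit conversion
total <- Reduce(`+`, Map(as.difftime, times, units = c("hours", "mins", "secs")))
as.numeric(total, units = "secs")
# values: 32, 92225

# Adding the durations to an arbitrary start date rolls over into the next day
start <- as.POSIXct("2013-01-01 00:00:00", tz = "UTC")
stamps <- start + total
format(stamps, "%Y-%m-%d %H:%M:%S")
# "2013-01-01 00:00:32" "2013-01-02 01:37:05"
```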
Answered 2013-04-18T05:47:58.867