I have time series data. I can't post the dput output because it is very long and pasting it here would make a mess, so here are a few lines from my text file:

        Time                  Depths   ifc  if   lat        lon
"7814" "2012-03-15 04:45:09-05" 4816 1040 5410 43.213628 -92.27727
"7815" "2012-03-15 04:30:04-05" 4813 1040 5410 43.213628 -92.27727
"7816" "2012-03-15 04:15:14-05" 4807 1040 5410 43.213628 -92.27727
"7817" "2012-03-15 04:00:09-05" 4809 1040 5410 43.213628 -92.27727
"7818" "2012-03-15 03:45:04-05" 4819 1040 5410 43.213628 -92.27727
"7819" "2012-03-15 03:30:15-05" 4816 1040 5410 43.213628 -92.27727
"7820" "2012-02-25 14:45:07-06" 4862 1040 5410 43.213628 -92.27727
"7821" "2012-02-25 14:30:02-06" 4858 1040 5410 43.213628 -92.27727
"7822" "2012-02-25 14:15:13-06" 4852 1040 5410 43.213628 -92.27727
"7823" "2012-02-25 14:00:08-06" 4860 1040 5410 43.213628 -92.27727
"7824" "2012-02-25 13:45:03-06" 4855 1040 5410 43.213628 -92.27727
"7825" "2012-02-25 13:30:13-06" 4869 1040 5410 43.213628 -92.27727
"7826" "2012-02-25 13:15:08-06" 4868 1040 5410 43.213628 -92.27727
"7827" "2012-02-25 13:00:03-06" 4873 1040 5410 43.213628 -92.27727

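Here you can see the values next to the row numbers. There is a jump at row 7819: the time stamps go from 2012-03-15 03:30:15 straight to 2012-02-25 14:45:07. I want to fix this so that the series has continuous 15-minute intervals, with the Depths column set to NA in the inserted intervals and the remaining columns filled with the same constant values as in the other rows.

I tried this on SO, but it didn't work. Can someone help me?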

2 Answers

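@Jase_ provided an example for the case where your data is a data.frame; however, you indicated in the comments, and in your initial attempt to use dput, that this is an object of class zoo.

Here is a zoo solution (conceptually borrowing from Jase_'s answer). It makes use of the index attribute of zoo objects.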

# First, read in your data as a zoo object via "copy and paste"
require(zoo)
t = read.zoo(text='        Time Depths   ifc  if   lat        lon
"7814" "2012-03-15 04:45:09-05" 4816 1040 5410 43.213628 -92.27727
"7815" "2012-03-15 04:30:04-05" 4813 1040 5410 43.213628 -92.27727
"7816" "2012-03-15 04:15:14-05" 4807 1040 5410 43.213628 -92.27727
"7817" "2012-03-15 04:00:09-05" 4809 1040 5410 43.213628 -92.27727
"7818" "2012-03-15 03:45:04-05" 4819 1040 5410 43.213628 -92.27727
"7819" "2012-03-15 03:30:15-05" 4816 1040 5410 43.213628 -92.27727
"7820" "2012-02-25 14:45:07-06" 4862 1040 5410 43.213628 -92.27727
"7821" "2012-02-25 14:30:02-06" 4858 1040 5410 43.213628 -92.27727
"7822" "2012-02-25 14:15:13-06" 4852 1040 5410 43.213628 -92.27727
"7823" "2012-02-25 14:00:08-06" 4860 1040 5410 43.213628 -92.27727
"7824" "2012-02-25 13:45:03-06" 4855 1040 5410 43.213628 -92.27727
"7825" "2012-02-25 13:30:13-06" 4869 1040 5410 43.213628 -92.27727
"7826" "2012-02-25 13:15:08-06" 4868 1040 5410 43.213628 -92.27727
"7827" "2012-02-25 13:00:03-06" 4873 1040 5410 43.213628 -92.27727
', tz="")

The merge itself takes only two lines:

# Modify your index as per Jase_'s answer
index(t) = index(t) - as.numeric(format(index(t), "%S"))
# Merge with an empty zoo object that has an index
# of all the dates that you need.
t.merged = merge(t, zoo(, seq(from=index(t)[1], 
                              to=index(t)[length(index(t))], 
                              by="15 min")))
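
Stripping the seconds lands every time stamp on the 15-minute grid only if each reading is logged within the first minute after a quarter hour, which holds for the sample rows. If that is not guaranteed for the full data set, a minimal sketch that rounds to the nearest 15 minutes instead could look like this. The time stamps, the `stamps` and `snapped` names, and the UTC time zone are assumptions made for illustration, and the approach presumes a time zone offset that is a multiple of 15 minutes.

```r
# Hypothetical time stamps; the last one is logged slightly early.
stamps <- as.POSIXct(c("2012-02-25 13:00:03", "2012-02-25 13:15:08",
                       "2012-02-25 13:29:58"), tz = "UTC")

# Round the underlying epoch seconds to the nearest multiple of 900 s (15 min).
step <- 15 * 60
snapped <- as.POSIXct(round(as.numeric(stamps) / step) * step,
                      origin = "1970-01-01", tz = "UTC")

format(snapped, "%H:%M:%S")
# "13:00:00" "13:15:00" "13:30:00"
```

With a zoo object, the rounded values could then be assigned back with `index(t) <- snapped` before merging.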

Let's see what the output looks like:

head(t.merged, 10L)
# 
# 2012-02-25 13:00:00 4873 1040 5410 43.21363 -92.27727
# 2012-02-25 13:15:00 4868 1040 5410 43.21363 -92.27727
# 2012-02-25 13:30:00 4869 1040 5410 43.21363 -92.27727
# 2012-02-25 13:45:00 4855 1040 5410 43.21363 -92.27727
# 2012-02-25 14:00:00 4860 1040 5410 43.21363 -92.27727
# 2012-02-25 14:15:00 4852 1040 5410 43.21363 -92.27727
# 2012-02-25 14:30:00 4858 1040 5410 43.21363 -92.27727
# 2012-02-25 14:45:00 4862 1040 5410 43.21363 -92.27727
# 2012-02-25 15:00:00   NA   NA   NA       NA        NA
# 2012-02-25 15:15:00   NA   NA   NA       NA        NA
tail(t.merged, 10L)
# 
# 2012-03-15 02:30:00   NA   NA   NA       NA        NA
# 2012-03-15 02:45:00   NA   NA   NA       NA        NA
# 2012-03-15 03:00:00   NA   NA   NA       NA        NA
# 2012-03-15 03:15:00   NA   NA   NA       NA        NA
# 2012-03-15 03:30:00 4816 1040 5410 43.21363 -92.27727
# 2012-03-15 03:45:00 4819 1040 5410 43.21363 -92.27727
# 2012-03-15 04:00:00 4809 1040 5410 43.21363 -92.27727
# 2012-03-15 04:15:00 4807 1040 5410 43.21363 -92.27727
# 2012-03-15 04:30:00 4813 1040 5410 43.21363 -92.27727
# 2012-03-15 04:45:00 4816 1040 5410 43.21363 -92.27727

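However, this does not fill in the NA values the way you wanted: the columns that should stay constant are NA in the inserted rows as well.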
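
One possible way to close that gap is to carry the constant columns forward with `na.locf` and then restore the NA values in Depths. The following is only a sketch on a small hypothetical series (the `z`, `z.merged`, `filled` and `depth.col` names and the three sample rows are made up for illustration), and it assumes the non-Depths columns really are constant over time.

```r
library(zoo)

# Hypothetical toy series on a 15-minute grid with the 13:30 row missing.
idx <- as.POSIXct(c("2012-02-25 13:00:00", "2012-02-25 13:15:00",
                    "2012-02-25 13:45:00"), tz = "UTC")
z <- zoo(cbind(Depths = c(4873, 4868, 4855), ifc = 1040, lat = 43.213628), idx)

# Merge with an index-only zoo object to insert the missing grid point.
z.merged <- merge(z, zoo(, seq(start(z), end(z), by = "15 min")))

# Carry every column forward, then put the original Depths (with its NA) back.
filled <- na.locf(z.merged, na.rm = FALSE)
depth.col <- match("Depths", colnames(z))
coredata(filled)[, depth.col] <- coredata(z.merged)[, depth.col]

print(filled)
```

In the inserted 13:30 row, Depths stays NA while the other columns repeat the last observed value.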

answered 2012-07-06T11:02:22.657

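If I have understood your question correctly, this should do it (assuming the data is not a zoo object but a data.frame named t). You may also have to change the column index 2 to 1.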

dates <- as.POSIXct(t[,2])
# remove the seconds from the time stamps
dates <- dates - as.numeric(format(dates,"%S"))
# Create a sequence of dates for the entire time.
all_dates <- seq(from=min(dates), to=max(dates), by="15 min")
# Put them into a data.frame to make merging easier
all_dates_frame <- data.frame(dates_floor=all_dates)
# create a data.frame of the observed values with the floored dates
t_floor <- data.frame(dates_floor=dates, t[,-2])
# merge the observations onto the grid, keeping every row from both sides
together <- merge(all_dates_frame, t_floor, all=TRUE)
# If you want to replace the floored times with the actual times then find which ones need replacing
replace_time_index <- !is.na(together[,2])
# replace the time stamps; merge() sorts by dates_floor in ascending order,
# so sort the original stamps the same way before assigning them
together[replace_time_index, 1] <- sort(as.POSIXct(t[,2]))
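
The merge leaves every column NA in the inserted rows, whereas the question asks for only Depths to be NA. Here is a rough sketch of one way to fill the constant columns afterwards, using a small hypothetical data frame (`df`, `grid` and the three sample rows are made up for illustration) and assuming those columns hold a single value throughout:

```r
# Hypothetical toy data frame, newest reading first as in the question.
df <- data.frame(
  Time   = as.POSIXct(c("2012-02-25 13:45:03", "2012-02-25 13:15:08",
                        "2012-02-25 13:00:03"), tz = "UTC"),
  Depths = c(4855, 4868, 4873),
  ifc    = 1040,
  lat    = 43.213628
)

# Floor to the minute, keeping the original stamp in its own column.
df$dates_floor <- df$Time - as.numeric(format(df$Time, "%S"))

# Build the full 15-minute grid and merge the observations onto it.
grid <- data.frame(dates_floor = seq(min(df$dates_floor),
                                     max(df$dates_floor), by = "15 min"))
together <- merge(grid, df, all.x = TRUE)

# Overwrite the constant columns; Depths and Time stay NA in the gap rows.
for (col in c("ifc", "lat")) {
  together[[col]] <- df[[col]][1]
}

print(together)
```

Keeping the original time in its own column also avoids having to write the actual stamps back over the floored ones.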
answered 2012-07-06T07:41:27.247