r - R/zoo: index entries in 'order.by' are not unique

Question

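I have a .csv file containing 4 columns of data against a date/time column at one-minute intervals. Some timestamps are missing, so I'm trying to generate the missing date/times and assign NA values to them in the data columns. I have done this before with other .csv files in exactly the same format without any problems. The code is: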

# read the csv file
har10 = read.csv(fpath, header=TRUE);

# set date
har10$HAR.TS<-as.POSIXct(har10$HAR.TS,format="%y/%m/%d %H:%M")

# convert to zoo
df1.zoo<-zoo(har10[,-1],har10[,1]) #set date to Index

# merge and generate NAs
df2 <- merge(df1.zoo,zoo(,seq(start(df1.zoo),end(df1.zoo),by="min")), all=TRUE)

# write zoo object to .csv file in Home directory
write.zoo(df2, file = "har10fixed.csv", sep = ",")

After converting to POSIXct, my data looks like this (a whole year, more or less), which seems fine:

                    HAR.TS        C1       C2         C3        C4
1      2010-01-01 00:00:00 -4390.659 5042.423 -2241.6344 -2368.762
2      2010-01-01 00:01:00 -4391.711 5042.056 -2241.1796 -2366.725
3      2010-01-01 00:02:00 -4390.354 5043.003 -2242.5493 -2368.786
4      2010-01-01 00:03:00 -4390.337 5038.570 -2242.7653 -2371.289

When I run the "convert to zoo" step, I get the following warning:

 Warning message:
 In zoo(har10[, -1], har10[, 1]) :
   some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique

I checked for duplicate entries but found none:

> anyDuplicated(har10)
[1] 0

Any ideas? I don't know why I get this warning for this file when the same code worked for previous files. Thanks!

Edit: reproducible form:

Edit 2: had to remove the data/code, sorry!

score 11 · Accepted Answer

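anyDuplicated(har10) tells you whether any complete rows are duplicated. zoo is warning about the index, so you should run anyDuplicated(har10$HAR.TS) instead. sum(duplicated(har10$HAR.TS)) will show that there are almost 9,000 duplicated date-times. The first duplicate is around row 311811, where 10/08/19 13:10 appears twice.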
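The difference between the two checks can be seen in a minimal sketch on a small made-up data frame (the values are hypothetical, not the asker's data). Two rows share a timestamp but differ in C1, so the row-wise check finds nothing while the check on the timestamp column finds the repeat:

```r
# Hypothetical data: rows 2 and 3 share a timestamp but differ in C1
df <- data.frame(
  HAR.TS = as.POSIXct(c("2010-01-01 00:00", "2010-01-01 00:01",
                        "2010-01-01 00:01", "2010-01-01 00:02"),
                      tz = "UTC"),
  C1 = c(1.0, 2.0, 2.5, 3.0)
)

row_dup   <- anyDuplicated(df)           # compares whole rows
index_dup <- anyDuplicated(df$HAR.TS)    # compares timestamps only
n_dup     <- sum(duplicated(df$HAR.TS))  # number of repeated timestamps

# Show every row involved in a duplicated timestamp, not just the repeats
dup_rows <- df[duplicated(df$HAR.TS) |
               duplicated(df$HAR.TS, fromLast = TRUE), ]
print(dup_rows)
```

Here row_dup is 0, while index_dup gives the position of the first repeated timestamp. Inspecting dup_rows is a quick way to decide whether the repeats are true duplicates or conflicting readings.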

score 2 · Answer

And to deal with the duplicated indexes (see ?zoo and ?aggregate.zoo):

## zoo series with duplicated indexes
z3 <- zoo(1:8, c(1, 2, 2, 2, 3, 4, 5, 5))
plot(z3)

## remove duplicated indexes by averaging
lines(aggregate(z3, index, mean), col = 2, type = "o")

## or by using the last observation
lines(aggregate(z3, index, tail, 1), col = 4)
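Applied to the workflow in the question, the same idea might look like the sketch below. It assumes the zoo package is installed and uses a small made-up minute series (ts, vals and the other variable names are illustrative only). Averaging the duplicates is just one choice; tail, 1 would keep the last observation instead.

```r
library(zoo)

# Hypothetical minute data: 00:01 appears twice and 00:03 is missing
ts <- as.POSIXct(c("2010-01-01 00:00", "2010-01-01 00:01",
                   "2010-01-01 00:01", "2010-01-01 00:02",
                   "2010-01-01 00:04"), tz = "UTC")
vals <- cbind(C1 = c(1, 2, 4, 5, 6), C2 = c(10, 20, 40, 50, 60))

# Building the series triggers the non-unique index warning from the question
z <- suppressWarnings(zoo(vals, ts))

# Collapse duplicated timestamps by averaging each column
z_unique <- aggregate(z, index(z), mean)

# Merge with an empty zoo series on the full one-minute grid to insert NAs
grid <- seq(start(z_unique), end(z_unique), by = "min")
z_full <- merge(z_unique, zoo(, grid), all = TRUE)
print(z_full)
```

Once the index is unique, the merge step from the question fills the missing minute with NA in every column.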
