
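Suppose I have an xts object, a data frame, or data in CSV format, and I want to filter it by dropping illiquid days, that is, days with fewer observations than some fixed threshold (say, fewer than 5k observations per day). How can I do this? I am working with tick-by-tick trade data.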

Posting the data as requested by the community. I want to filter out April 5 because it has fewer than 3 observations:

"DateTime","spy.prices.Open","spy.prices.High","spy.prices.Low","spy.prices.Close"
2007-04-02 09:34:59,142.16,142.34,142.13,142.2
2007-04-02 09:39:59,142.19,142.32,142.14,142.16
2007-04-02 09:44:58,142.16,142.27,142.03,142.25
2007-04-02 09:49:59,142.26,142.28,142.16,142.18
2007-04-02 09:54:57,142.17,142.24,142.15,142.2
2007-04-02 09:59:57,142.2,142.23,142.09,142.13

2007-04-05 14:19:57,144.3,144.34,144.29,144.33
2007-04-05 14:24:59,144.33,144.43,144.31,144.42

2007-04-10 14:34:58,144.64,144.71,144.59,144.62
2007-04-10 14:39:56,144.62,144.69,144.62,144.67
2007-04-10 14:44:59,144.67,144.72,144.67,144.71
2007-04-10 14:49:59,144.7,144.73,144.66,144.73
2007-04-10 14:54:59,144.73,144.75,144.69,144.7
2007-04-10 14:59:58,144.701,144.72,144.7,144.71
2007-04-10 15:04:58,144.72,144.78,144.71,144.74
2007-04-10 15:09:58,144.7499,144.79,144.74,144.77
2007-04-10 15:14:59,144.77,144.7799,144.69,144.69
2007-04-10 15:19:57,144.69,144.73,144.66,144.719
2007-04-10 15:24:59,144.71,144.79,144.71,144.79
2007-04-10 15:29:59,144.79,144.79,144.72,144.725
2007-04-10 15:34:59,144.73,144.79,144.73,144.78
2007-04-10 15:39:57,144.78,144.83,144.76,144.77
2007-04-10 15:44:59,144.78,144.81,144.73,144.77
2007-04-10 15:49:59,144.78,144.78,144.73,144.74
2007-04-10 15:54:57,144.74,144.8,144.73,144.79
2007-04-10 15:59:59,144.79,144.82,144.79,144.8

1 Answer


Throwing away data seems silly... but here you go:

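library(xts)
# Read the CSV into a zoo series indexed by POSIXct, then convert to xts
x <- as.xts(read.zoo("data.csv", sep=",", header=TRUE, FUN=as.POSIXct))
# Count rows per day; the count lands on each day's last row, so fill it backward
x <- merge(x, N=apply.daily(x, nrow), fill=function(f) na.locf(f, fromLast=TRUE))
# Keep only days with more than 2 observations
x <- x[x$N > 2,]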
answered 2013-05-31T20:24:41.020
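
The question also mentions the plain data-frame case. A minimal base R sketch of the same idea, with no xts involved, might look like the following. It is not part of the answer above: the helper name `filter_thin_days` and its `min_obs` parameter are hypothetical, and the code assumes a `DateTime` column whose first ten characters are the date, as in the posted CSV. The inline sample is a shortened excerpt of the posted data.

```r
# Hypothetical helper (not from the original answer): keep only the rows that
# belong to days with at least `min_obs` observations.
# Assumes df$DateTime looks like "YYYY-MM-DD HH:MM:SS".
filter_thin_days <- function(df, min_obs = 3) {
  # Day key taken from the leading "YYYY-MM-DD" part of the timestamp
  day <- substr(as.character(df$DateTime), 1, 10)
  # Per-row count of observations on that row's day
  n_per_day <- ave(seq_along(day), day, FUN = length)
  df[n_per_day >= min_obs, , drop = FALSE]
}

# Shortened excerpt of the data posted in the question
csv_text <- '"DateTime","spy.prices.Open","spy.prices.High","spy.prices.Low","spy.prices.Close"
2007-04-02 09:34:59,142.16,142.34,142.13,142.2
2007-04-02 09:39:59,142.19,142.32,142.14,142.16
2007-04-02 09:44:58,142.16,142.27,142.03,142.25
2007-04-05 14:19:57,144.3,144.34,144.29,144.33
2007-04-05 14:24:59,144.33,144.43,144.31,144.42
2007-04-10 14:34:58,144.64,144.71,144.59,144.62
2007-04-10 14:39:56,144.62,144.69,144.62,144.67
2007-04-10 14:44:59,144.67,144.72,144.67,144.71
'

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
kept <- filter_thin_days(df, min_obs = 3)
print(kept)
```

For real tick data the threshold would be something like `min_obs = 5000`. Taking the day from the timestamp string sidesteps time zone handling, which may or may not be what you want if the timestamps are not already in exchange local time.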