这类似于其他几个问题,但我仍然没有设法让它工作。我正在尝试将滚动功能(runMAD
来自 TTR 包)应用于一系列财务信息(交易价格和交易量),但滚动操作都应该是当日的(即不会跨入第二天,也不会从前一天开始) )。
数据存储为 xts 对象,具有两列(价格和数量)。我按天拆分此对象以按天创建 xts 对象列表,希望我可以runMAD
在每天应用该函数,并返回一个包含四列的 xts 对象列表(两个原始价格和交易量列,然后是两个新的 runMAD-price 和 runMAD-volume)。然而,它似乎lapply
只能返回一个长度等于天数的列表 - 所以像每日平均值这样的函数可以工作,但不确定如何让滚动函数工作,每天抛出多个结果。最后,我想测试每一行价格和交易量,看看它与中位数的偏差是否比其各自的滚动 MAD 大 3 倍。然后将此类实例保存在变量中(使用 xts 索引)。
sample data:
2011-01-02 09:11:00 20.20 20000
2011-01-02 10:42:00 20.40 50000
2011-01-02 12:33:00 20.90 5000
2011-01-02 12:33:00 20.90 10000
2011-01-02 13:25:00 20.90 10000
2011-01-02 15:36:00 20.90 50000
2011-01-02 15:39:00 20.90 50000
2011-01-02 16:12:00 20.90 15000
2011-01-02 16:33:00 20.90 50000
2011-01-02 17:04:00 20.90 50000
2011-01-02 17:05:00 20.90 50000
2011-01-02 18:26:00 20.90 10000
2011-01-03 09:21:00 20.20 20000
2011-01-03 11:19:00 20.40 50000
2011-01-03 17:33:00 20.90 5000
2011-01-06 12:33:00 20.90 10000
2011-01-06 13:25:00 20.90 10000
2011-01-06 15:36:00 20.90 50000
2011-01-06 15:39:00 20.90 50000
2011-01-06 16:20:00 20.90 15000
2011-01-06 16:20:00 20.90 50000
2011-01-06 17:24:00 20.90 50000
2011-01-06 17:55:00 20.90 50000
2011-01-06 18:49:00 20.90 10000
[编辑:根据 GSee 的评论,这里是一个更大的(和固定的)数据样本:]
dput(head(sample.data,200))
structure(c(23.48, 19.08, 22.43, 21.75, 19.78, 19.76, 23.11, 23.31, 19.58, 19.28, 23.68, 23.47, 22.13, 23.17, 24.53, 22.79, 20.28, 19.73, 22.05, 19.23, 22.40, 23.43, 20.40, 23.58, 19.94, 19.58, 21.70, 20.65, 21.69, 24.09, 22.21, 21.42, 19.48, 24.20, 23.37, 24.68, 21.64, 23.24, 23.59, 19.61, 21.89, 23.21, 19.47, 24.77, 19.64, 19.61, 21.07, 24.68, 20.20, 23.37, 24.47, 22.55, 19.64, 22.73, 23.94, 21.31, 19.69, 19.51, 21.27, 20.47, 23.67, 23.42, 20.56, 20.60, 22.23, 22.16, 20.78, 21.47, 24.82, 19.34, 19.08, 24.92, 24.12, 20.46, 21.67, 22.25, 19.16, 22.47, 24.00, 23.32, 21.13, 23.67, 20.35, 20.80, 23.48, 21.22, 21.51, 21.91, 24.06, 22.17, 24.23, 22.43, 24.94, 20.99, 24.93, 22.48, 21.74, 19.18, 21.14, 21.89, 20.23, 24.80, 22.81, 20.96, 24.30, 21.11, 23.69, 23.20, 20.09, 23.09, 21.56, 20.93, 22.97, 24.53, 23.96, 19.33, 24.63, 19.27, 19.57, 21.40, 24.04, 22.37, 24.95, 21.37, 24.01, 20.42, 21.82, 19.06, 23.08, 20.37, 21.28, 19.87, 21.84, 20.95, 20.89, 24.59, 19.18, 24.26, 20.64, 22.12, 20.85, 24.31, 22.55, 24.32, 19.90, 24.58, 24.13, 21.55, 21.06, 20.84, 20.68, 23.47, 19.67, 21.82, 23.31, 19.67, 19.45, 24.56, 20.06, 24.46, 19.89, 21.57, 20.21, 24.20, 20.53, 19.91, 23.67, 22.95, 21.18, 21.57, 20.01, 19.36, 20.65, 19.75, 22.12, 19.79, 21.07, 24.77, 21.46, 20.64, 21.94, 19.81, 23.58, 22.48, 21.03, 19.37, 19.46, 22.68, 24.56, 21.40, 21.94, 23.27, 21.26, 20.34, 23.39, 21.97, 22.25, 22.40, 23.44, 24.38, 5900, 2400, 3100, 1200, 5100, 5700, 3400, 4900, 3900, 6400, 2600, 2200, 2800, 6200, 1100, 2500, 3500, 5900, 6600, 5100, 1800, 4200, 4500, 1200, 6200, 1400, 2800, 2400, 1500, 900, 2900, 3200, 5800, 3500, 4500, 4500, 4500, 6800, 5000, 4900, 5300, 6100, 3400, 1500, 6200, 1500, 5400, 5800, 800, 6300, 5000, 3400, 5500, 5600, 2100, 3700, 4400, 3100, 4600, 2700, 2700, 2700, 2600, 1200, 1700, 2900, 5300, 4900, 2600, 3100, 4000, 6000, 2500, 3000, 6500, 3600, 5900, 2300, 1600, 1300, 4800, 3800, 4000, 1500, 5600, 5500, 1500, 3100, 1300, 3200, 1500, 1400, 1400, 4600, 5300, 5600, 1200, 1500, 6100, 2200, 6700, 1700, 4000, 6200, 2700, 1100, 4500, 5400, 1600, 2600, 5700, 4100, 7000, 2300, 2600, 5100, 4500, 1900, 2400, 2600, 6900, 2100, 3500, 3600, 2900, 900, 2800, 2200, 4900, 1700, 3000, 800, 2500, 6100, 3300, 6500, 2600, 6200, 900, 4000, 6600, 6400, 1000, 4800, 6900, 4000, 2800, 1200, 6200, 2700, 1300, 1200, 2800, 1300, 5900, 4000, 3900, 5300, 2400, 4700, 6900, 2900, 6200, 3300, 6900, 2200, 1800, 5500, 2000, 6400, 3500, 5300, 5000, 6700, 4500, 3200, 1700, 3300, 5000, 5800, 4900, 2100, 1200, 5200, 1000, 900, 2600, 1000, 5500, 3400, 1600, 4700, 1500, 5900, 6900, 3700, 4900, 5700, 900, 2400), class = c("xts", "zoo"), .indexCLASS = c("POSIXlt",
"POSIXt"), .indexTZ = "", tclass = c("POSIXlt", "POSIXt"), tzone = "", index = structure(c(1325584080, 1325594940, 1325594940, 1325604600, 1325759100, 1325762520, 1325762520, 1325769300, 1325769300, 1325848080, 1325864880, 1326128220, 1326196500, 1326196500, 1326196500, 1326196500, 1326196500, 1326196500, 1326209700, 1326279480, 1326283620, 1326288300, 1326288300, 1326289680, 1326289680, 1326289680, 1326292320, 1326294060, 1326294600, 1326297600, 1326387000, 1326456720, 1326467160, 1326711600, 1326723000, 1326724260, 1326809940, 1326814860, 1326885960, 1326885960, 1326889980, 1326894000, 1326895200, 1326895200, 1326898080, 1326986700, 1326987240, 1326992100, 1327072140, 1327328040, 1327328040, 1327328040, 1327417920, 1327423140, 1327424820, 1327425240, 1327483200, 1327496520, 1327570320, 1327570320, 1327575420, 1327588680, 1327588980, 1327595880, 1327595880, 1327595880, 1327664820, 1327674720, 1327680660, 1327680780, 1327680780, 1327683960, 1327914300, 1327914300, 1327915260, 1327918140, 1327924860, 1327924920, 1327924980, 1327924980, 1327927680, 1328013360, 1328014200, 1328025000, 1328025000, 1328026740, 1328089440, 1328091360, 1328091360, 1328110620, 1328111340, 1328111340, 1328112420, 1328113800, 1328193540, 1328194080, 1328194140, 1328196720, 1328274360, 1328274420, 1328278320, 1328519280, 1328520120, 1328520600, 1328520600, 1328524140, 1328527980, 1328531580, 1328540880, 1328540880, 1328547600, 1328547660, 1328547720, 1328547780, 1328607060, 1328608080, 1328618760, 1328623380, 1328623380, 1328625720, 1328631480, 1328717760, 1328717880, 1328793000, 1328797980, 1329132840, 1329210480, 1329215400, 1329215820, 1329215820, 1329219480, 1329223140, 1329300900, 1329301620, 1329315240, 1329315240, 1329388740, 1329389700, 1329390000, 1329390000, 1329390180, 1329391860, 1329391860, 1329391860, 1329402120, 1329467700, 1329467700, 1329469080, 1329469080, 1329471300, 1329477000, 1329477000, 1329489840, 1329732660, 1329732660, 1329733080, 1329745980, 1329745980, 1329750120, 1329750120, 1329750780, 1329753780, 1329753780, 1329753900, 1329754080, 1329754260, 1329755520, 1329755520, 1329755820, 1329755880, 1329755880, 1329756000, 1329815040, 1329815040, 1329826380, 1329840000, 1329841200, 1329841200, 1329842220, 1329901080, 1329901200, 1329901200, 1329901680, 1329901680, 1329902040, 1329902100, 1329902100, 1329902100, 1329902340, 1329902340, 1329902340, 1329902520, 1329902880, 1329902880, 1329906720, 1329906840, 1329908460, 1329910980, 1329911880, 1329914400), tzone = "", tclass = c("POSIXlt", "POSIXt")), .Dim = c(200L, 2L), .Dimnames = list(NULL, c("sample.dataprice", "sample.datavol")))
在同一时间段内经常有多笔交易,在这种情况下,runMAD
如果还没有,我希望单独查看每笔交易。此外,我意识到 的窗口runMAD
可能大于特定日期的交易数量,在这种情况下,我想忽略当天的(NA) - 或者如果可能的话,将runMAD
窗口大小动态减小为交易数量在那一天。
我的数据的str:
str(样本。数据)
An ‘xts’ object on 2012-01-03 09:11:00/2012-03-30 18:49:00 containing:
Data: num [1:459, 1:2] 22.6 22.5 22.5 22.5 22.8 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "sample.dataprice" "sample.datavol"
Indexed by objects of class: [POSIXlt,POSIXt] TZ:
xts Attributes:
NULL
我split
这几天然后runMAD
:
sample.data.days <- split.xts(sample.data, f="days")
require(TTR)
runMAD(sample.data.days, n=6, stat="median", cumulative=FALSE, constant=1)
*Error in runMedian(x, n, cumulative = cumulative) : (list) object cannot be coerced to type 'double'*
为了避免应用于runMAD
观察次数少于窗口长度的日子,我只保留观察次数充足的日子:
sample.data.days6<- sample.data.days[vapply(sample.data.days, FUN=nrow, FUN.VALUE=0L)>6]
我也试过这个:
do.call(runMAD, lapply(as.numeric(index(sample.data.days6)), function(x) runMAD(sample.data.days6, constant=1)))
但它会引发与上述相同的错误(runMedian(x, n,累积=累积)中的错误:(列表)对象不能被强制输入'double')。
我没有将数据放入列表中以分解为天,而是尝试了以下方法:
apply.daily(sample.data, function(d){
c(runMAD_price = runMAD(d$sample.data[, 1], n=6),
runMAD_volume = runMAD(d$sample.data[, 2], n=6),
)
})
*Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), : 'data' must be of a vector type
这个问题的一个细微变化是看看如何对移动时间窗口(例如 1 小时)而不是移动交易窗口进行同样的处理。
无论窗口是基于时间还是基于观察,我都想最终创建一个 xts 对象,其中包含所有观察值(价格或交易量)与某个窗口中位数的偏差除以 MAD(在中间-窗口),高于某个阈值(例如 3)。由于当一半的观察值相同时,MAD 将为 0,我想用任何先前的 MAD>0 替换任何 0。