Why do I get different correlation results from cor() and ccf()?
> library(xts)
> set.seed(123)
> ts1 = xts(1:100, as.POSIXlt(1366039619, tz="", origin="1970-01-01") + rnorm(100, 0, 3))
> ts2 = xts(1:100, as.POSIXlt(1366039619, tz="", origin="1970-01-01") + rnorm(100, 0, 3))
> as.vector(ccf(as.integer(ts1[,1]), as.integer(ts2[,1]), lag.max =10, plot =F, na.action=na.pass)$acf)
[1] -0.13747975 -0.00747975 -0.09497750 -0.01031203 -0.07564956 0.19881488 -0.11353135 0.01673867 0.12900690 0.00059706 -0.09642964 0.20852985 0.02476448 0.00126913 -0.03467147 -0.04284728 -0.05561356
[18] 0.08875188 0.01587159 -0.04449745 0.01002100
> sapply(seq(-10, 10), function(x, ts1, ts2) { cor(ts1[,1], lag(ts2[,1], x), use="complete.obs") }, ts1, ts2)
[1] -0.154055651 -0.008411318 -0.104222576 -0.011595184 -0.082495425 0.210464976 -0.118454928 0.018112365 0.132716811 0.000694595 -0.096429643 0.209312640 0.025156993 0.001450175 -0.035451383
[16] -0.043902825 -0.057842616 0.093863686 0.017485161 -0.047042779 0.011511559
> sapply(seq(-10, 10), function(x, ts1, ts2) { cor(ts1[,1], lag(ts2[,1], x), use="complete.obs") }, ts1, ts2) - as.vector(ccf(as.integer(ts1[,1]), as.integer(ts2[,1]), lag.max =10, plot =F, na.action=na.pass)$acf)
[1] -0.0165759032546357876203 -0.0009315701778466996610 -0.0092450780124607306876 -0.0012831523310935632337 -0.0068458595845764941279 0.0116500945970494651505 -0.0049235745757881255180
[8] 0.0013736907995123247284 0.0037099107611970050247 0.0000975349354166987759 -0.0000000000000000277556 0.0007827869094209904954 0.0003925162566637135919 0.0001810479989895477041
[15] -0.0007799161627975795263 -0.0010555407353524254299 -0.0022290547145371181204 0.0051118107350296843050 0.0016135741880074876142 -0.0025453295798825298357 0.0014905566679348520448
Update
Since ccf() is built on acf(), the discrepancy can be reduced to:
> as.vector(acf(c(42, 5, 65437, 23), plot=F, lag.max=1)$acf)
[1] 1.000000 -0.416954
> cor(c(42, 5, 65437, 23), c(NA, 42, 5, 65437), use="pairwise.complete.obs")
[1] -0.500218
> cor(c(42, 5, 65437, 23), c(5, 65437, 23, NA), use="pairwise.complete.obs")
[1] -0.500218