r - Divide observations by the period mean. Help simplifying the code

Question

Data link: http://dl.dropbox.com/u/56075871/data.txt

I want to divide each observation by the mean for its hour. Example:

2012-01-02 10:00:00     5.23
2012-01-03 10:00:00     5.28
2012-01-04 10:00:00     5.29
2012-01-05 10:00:00     5.29
2012-01-09 10:00:00     5.28
2012-01-10 10:00:00     5.33
2012-01-11 10:00:00     5.42
2012-01-12 10:00:00     5.55
2012-01-13 10:00:00     5.68
2012-01-16 10:00:00     5.53

The mean is 5.388. Next I want to divide each observation by that mean, so: 5.23/5.388, 5.28/5.388, ... up to the last one, 5.53/5.388.
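
As a quick check of the arithmetic above, here is a minimal sketch in base R using only the ten 10:00:00 values listed. The variable names (x, hour_mean, ratios) are illustrative and not taken from the original code.

```r
# The ten 10:00:00 observations from the example above
x <- c(5.23, 5.28, 5.29, 5.29, 5.28, 5.33, 5.42, 5.55, 5.68, 5.53)

# Mean of this hour across all days
hour_mean <- mean(x)

# Each observation divided by the mean of its hour
ratios <- x / hour_mean

print(hour_mean)
print(round(ratios, 4))
```

By construction the resulting ratios average to 1, which is a handy sanity check for any of the approaches discussed here.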

I have hourly time series for 10 stocks:

                        S1.1h   S2.1h     S3.1h   S4.1h  S5.1h       S6.1h    S7.1h  S8.1h      S9.1h       S10.1h
2012-01-02 10:00:00     64.00   110.7     5.23    142.0  20.75       34.12    32.53  311.9      7.82        5.31
2012-01-02 11:00:00     64.00   110.8     5.30    143.2  20.90       34.27    32.81  312.0      7.97        5.34
2012-01-02 12:00:00     64.00   111.1     5.30    142.8  20.90       34.28    32.70  312.4      7.98        5.33
2012-01-02 13:00:00     61.45   114.7     5.30    143.1  21.01       34.35    32.85  313.0      7.96        5.35
2012-01-02 14:00:00     61.45   116.2     5.26    143.7  21.10       34.60    32.99  312.9      7.95        5.36
2012-01-02 15:00:00     63.95   116.2     5.26    143.2  21.26       34.72    33.00  312.6      7.99        5.37
2012-01-02 16:00:00     63.95   117.3     5.25    143.3  21.27       35.08    33.04  312.7      7.99        5.36
2012-01-02 17:00:00     63.95   117.8     5.24    144.7  21.25       35.40    33.10  313.6      7.99        5.40
2012-01-02 18:00:00     63.95   117.9     5.23    145.0  21.20       35.50    33.17  312.5      7.98        5.35
2012-01-03 10:00:00     63.95   115.5     5.28    143.5  21.15       35.31    33.05  311.7      7.94        5.37
...

I want to divide each observation by the mean for its hour (periodically). I have some code. The code that computes the means:

#10:00:00, 11:00:00, ... 18:00:00
times <- paste(seq(10, 18),":00:00", sep="")
#means - matrix of means for timeseries and hour
means <- matrix(ncol= ncol(time_series), nrow = length(times))
for (t in 1:length(times)) {
  #t is time 10 to 18
  for(i in 1:ncol(time_series)) {
    #i is stock 1 to 10
    # hour mean for each observation in data
    means[t,i] <- mean(time_series[grep(times[t], index(time_series)), i])
  }
}

My code that "does the job":

for (t in 1:length(times)) {
  # get all dates with times[t] hour
  hours <- time_series[grep(times[t], index(time_series))]
  ep <- endpoints(hours, "hours")

  out <- rbind(out, period.apply(hours, INDEX=ep, FUN=function(x) {
    x/means[t,]
  }))
}

I know it is awful, but it works. How can I simplify the code?

score 3 · Accepted Answer

Here is one way:

# Split the xts object into chunks by hour
# .indexhour() returns the hourly portion for each timestamp
s <- split(time_series, .indexhour(time_series))
# Use sweep to divide each value of x by colMeans(x) for each group of hours
l <- lapply(s, function(x) sweep(x, 2, colMeans(x), FUN="/"))
# rbind everything back together
r <- do.call(rbind, l)
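
To see what the sweep step does on its own, here is a small base R sketch on a plain matrix rather than an xts object. The matrix m and its column names A and B are made up for illustration; in the answer above the same call is applied to each hourly chunk.

```r
# Toy matrix: 3 observations (rows) of 2 hypothetical series (columns).
# matrix() fills column by column, so A = 1, 2, 3 and B = 10, 20, 30.
m <- matrix(c(1, 2, 3,
              10, 20, 30),
            ncol = 2,
            dimnames = list(NULL, c("A", "B")))

# Divide every column (MARGIN = 2) by its own column mean
scaled <- sweep(m, 2, colMeans(m), FUN = "/")

print(scaled)
```

Both columns end up with the same relative values because each is divided by its own mean, so series on very different price levels become comparable.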

score 0 · Accepted Answer

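The scale function can do this. Used together with ave, you can restrict the calculation to within each hour. Post the output of dput on your xts/zoo object and you will get a quick reply.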
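
The answer gives no code, so the following is only a sketch of how ave could be used for this, in base R on a single made-up series. The data and the names timestamps, price, hour and ratio are hypothetical; applying it to a multi-column xts object would need an extra step per column.

```r
# Hypothetical toy data: one series observed at 10:00 and 11:00 on three days
timestamps <- as.POSIXct(c("2012-01-02 10:00:00", "2012-01-02 11:00:00",
                           "2012-01-03 10:00:00", "2012-01-03 11:00:00",
                           "2012-01-04 10:00:00", "2012-01-04 11:00:00"),
                         tz = "UTC")
price <- c(5.23, 5.30, 5.28, 5.31, 5.29, 5.35)

# Grouping variable: the hour of each timestamp ("10" or "11")
hour <- format(timestamps, "%H")

# ave() applies FUN within each group and returns a vector aligned with price
ratio <- ave(price, hour, FUN = function(v) v / mean(v))

print(data.frame(timestamps, price, hour, ratio))
```

Note that scale with its default arguments subtracts the column mean and divides by the standard deviation, so dividing by the mean alone would need something like center = FALSE together with scale = colMeans(x).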
