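I have a time series and I would like to compute an average automatically for every 1-hour period. My data consist of a temperature and a date-time (timestamp).
I do not want a moving average; I want one mean for hour 1, one for hour 2, one for hour 3, and so on, because the data are normally recorded every 2 minutes throughout the day.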

 temperature    date_time
1     -1.52 2007-09-29 00:00:08
2     -1.48 2007-09-29 00:02:08
3     -1.46 2007-09-29 00:04:08
4     -1.56 2007-09-29 00:06:08
5     -1.64 2007-09-29 00:08:08
6     -1.75 2007-09-29 00:10:08
7     -1.74 2007-09-29 00:12:08
8     -2.02 2007-09-29 00:14:08
9     -2.02 2007-09-29 00:16:08
10    -1.90 2007-09-29 00:18:08
11    -1.66 2007-09-29 00:20:08
12    -1.80 2007-09-29 00:22:08
13    -1.68 2007-09-29 00:24:08
14    -1.81 2007-09-29 00:26:08
15    -1.77 2007-09-29 00:28:08
16    -1.83 2007-09-29 00:30:08
17    -1.84 2007-09-29 00:32:08
18    -1.93 2007-09-29 00:34:08
19    -1.62 2007-09-29 00:36:08
20    -1.87 2007-09-29 00:38:08
21    -1.54 2007-09-29 00:40:08
22    -1.93 2007-09-29 00:42:08
23    -1.88 2007-09-29 00:44:08
24    -1.82 2007-09-29 00:46:08
25    -1.78 2007-09-29 00:48:08
26    -1.67 2007-09-29 00:50:08
27    -1.67 2007-09-29 00:52:08
28    -1.56 2007-09-29 00:54:08
29    -1.84 2007-09-29 00:56:08
30    -1.74 2007-09-29 00:58:08
31    -1.79 2007-09-29 01:00:08
32    -1.82 2007-09-29 01:02:08
33    -1.78 2007-09-29 01:04:08
34    -1.88 2007-09-29 01:06:08
35    -1.84 2007-09-29 01:08:08
36    -1.78 2007-09-29 01:10:08
37    -1.94 2007-09-29 01:12:08
38    -1.80 2007-09-29 01:14:08
39    -1.74 2007-09-29 01:16:08
40    -1.76 2007-09-29 01:18:08
41    -1.80 2007-09-29 01:20:08
42    -1.60 2007-09-29 01:22:08
43    -1.59 2007-09-29 01:24:08
44    -1.52 2007-09-29 01:26:08
45    -1.41 2007-09-29 01:28:08
46    -1.42 2007-09-29 01:30:08
47    -1.44 2007-09-29 01:32:08
48    -1.38 2007-09-29 01:34:08
49    -1.34 2007-09-29 01:36:08
50    -1.40 2007-09-29 01:38:08
51    -1.40 2007-09-29 01:40:08
52    -1.48 2007-09-29 01:42:08
53    -1.36 2007-09-29 01:44:08
54    -1.42 2007-09-29 01:46:08
55    -1.46 2007-09-29 01:48:08
56    -1.46 2007-09-29 01:50:08
57    -1.47 2007-09-29 01:52:08
58    -1.50 2007-09-29 01:54:08
59    -1.51 2007-09-29 01:56:08
60    -1.49 2007-09-29 01:58:08
61    -1.54 2007-09-29 02:00:08
62    -1.50 2007-09-29 02:02:08
63    -1.55 2007-09-29 02:04:08
64    -1.52 2007-09-29 02:06:08
65    -1.66 2007-09-29 02:08:08
66    -1.88 2007-09-29 02:10:08
67    -1.72 2007-09-29 02:12:08
68    -1.68 2007-09-29 02:14:08
69    -1.68 2007-09-29 02:16:08
70    -1.60 2007-09-29 02:18:08
71    -1.71 2007-09-29 02:20:08
72    -1.71 2007-09-29 02:22:08
73    -1.80 2007-09-29 02:24:08
74    -1.76 2007-09-29 02:26:08
75    -1.84 2007-09-29 02:28:08
76    -1.96 2007-09-29 02:30:08
77    -2.06 2007-09-29 02:32:08
78    -2.16 2007-09-29 02:34:08
79    -2.04 2007-09-29 02:36:08
80    -1.93 2007-09-29 02:38:08
81    -1.98 2007-09-29 02:40:08
82    -1.86 2007-09-29 02:42:08
83    -2.08 2007-09-29 02:44:08
84    -1.78 2007-09-29 02:46:08
85    -1.50 2007-09-29 02:48:08
86    -1.60 2007-09-29 02:50:08
87    -1.53 2007-09-29 02:52:08
88    -1.76 2007-09-29 02:54:08
89    -1.64 2007-09-29 02:56:08
90    -1.52 2007-09-29 02:58:08
91    -1.82 2007-09-29 03:00:08

2 Answers

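Assuming your data set is called temp and your "date_time" variable is already in a proper date-time class (converted with, for example, as.POSIXlt(temp$date_time)), you can simply combine aggregate and cut to get an hourly summary: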

aggregate(list(temperature = temp$temperature), 
          list(hourofday = cut(temp$date_time, "1 hour")), 
          mean)
#             hourofday temperature
# 1 2007-09-29 00:00:00   -1.744333
# 2 2007-09-29 01:00:00   -1.586000
# 3 2007-09-29 02:00:00   -1.751667
# 4 2007-09-29 03:00:00   -1.820000
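
If date_time is still stored as text, the conversion step mentioned above has to happen before the aggregate/cut call. The following is a minimal sketch of the whole pipeline on six of the readings from the question; the use of as.POSIXct rather than as.POSIXlt, the tz = "UTC" setting, and the name hourly are assumptions made for illustration.

```r
# Small sample taken from the question's data: three readings in each of two hours.
temp <- data.frame(
  temperature = c(-1.52, -1.48, -1.46, -1.79, -1.82, -1.78),
  date_time   = c("2007-09-29 00:00:08", "2007-09-29 00:02:08",
                  "2007-09-29 00:04:08", "2007-09-29 01:00:08",
                  "2007-09-29 01:02:08", "2007-09-29 01:04:08"),
  stringsAsFactors = FALSE
)

# Parse the character timestamps into a date-time class.
# tz = "UTC" is an assumption; use the time zone the logger actually recorded in.
temp$date_time <- as.POSIXct(temp$date_time,
                             format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# cut() assigns every timestamp to the clock hour it falls in;
# aggregate() then takes the mean of the temperatures within each hour.
hourly <- aggregate(list(temperature = temp$temperature),
                    list(hourofday = cut(temp$date_time, "1 hour")),
                    mean)
print(hourly)
```

Note that hourofday comes back as a factor, so it needs to be converted again if the hourly series is to be used as a date-time afterwards.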
answered 2012-12-17T14:22:34.730

Since you are working with a time series, you can use the xts package (or zoo, or ts).

Here I assume that your data look like this:

 head(dat)
     V2         V3       V4
2 -1.52 2007-09-29 00:00:08
3 -1.48 2007-09-29 00:02:08
4 -1.46 2007-09-29 00:04:08
5 -1.56 2007-09-29 00:06:08
6 -1.64 2007-09-29 00:08:08
7 -1.75 2007-09-29 00:10:08

First I construct the xts object:

  library(xts)
  dat.xts <- xts(x = dat$V2,as.POSIXct(paste(dat$V3,dat$V4)))


 head(dat.xts)
                     [,1]
2007-09-29 00:00:08 -1.52
2007-09-29 00:02:08 -1.48
2007-09-29 00:04:08 -1.46
2007-09-29 00:06:08 -1.56
2007-09-29 00:08:08 -1.64
2007-09-29 00:10:08 -1.75

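Then I use period.apply, which, like the rest of the apply family, evaluates a given function over each successive block of data values, here the blocks delimited by the hourly endpoints: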

ep <- endpoints(dat.xts,'hours')
period.apply(dat.xts,ep,mean)
                         [,1]
2007-09-29 00:58:08 -1.744333
2007-09-29 01:58:08 -1.586000
2007-09-29 02:58:08 -1.751667
2007-09-29 03:00:08 -1.820000
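
The result above is labelled with the last timestamp of each hour (00:58:08, 01:58:08, ...) rather than with the start of the hour. A rough base-R sketch of the idea behind endpoints and period.apply may make that clearer. It is an illustration only, not the xts implementation, and the names times, values, hour_id and block_means are hypothetical.

```r
# Six readings from the question: three in hour 00, three in hour 01.
times <- as.POSIXct(c("2007-09-29 00:00:08", "2007-09-29 00:02:08",
                      "2007-09-29 00:04:08", "2007-09-29 01:00:08",
                      "2007-09-29 01:02:08", "2007-09-29 01:04:08"),
                    tz = "UTC")  # tz is an assumption
values <- c(-1.52, -1.48, -1.46, -1.79, -1.82, -1.78)

# Label every observation with its clock hour.
hour_id <- format(times, "%Y-%m-%d %H")

# Positions of the last observation in each hour, with a leading 0,
# mimicking the kind of index vector that endpoints() produces.
n  <- length(times)
ep <- c(0, which(hour_id[-1] != hour_id[-n]), n)

# Apply mean() to each block (ep[i] + 1):ep[i + 1], which is the role
# period.apply() plays for the xts object.
block_means <- sapply(seq_len(length(ep) - 1),
                      function(i) mean(values[(ep[i] + 1):ep[i + 1]]))

# Label each mean with the last timestamp of its block.
names(block_means) <- format(times[ep[-1]], "%Y-%m-%d %H:%M:%S")
print(block_means)
```

Because each block is identified by where it ends, the labels are the final observation times of each period, which is why the last row of the hourly output is stamped 03:00:08 and covers a single reading.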

To compute weekly means instead, for example, you only need to change ep (the endpoints):

ep <- endpoints(dat.xts,'weeks')
period.apply(dat.xts,ep,mean)

                    [,1]
2007-09-29 03:00:08 -1.695385

plot(dat.xts)

answered 2012-12-17T14:23:25.713