
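I am fairly new to R packages and I am working with a time series. I have to build a forecasting model to predict future click counts. The forecast interval must be hourly.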

My sample time series:

 DateTime            Clicks


(06/23/13 00:00:00)  757
(06/23/13 01:00:00)  714
(06/23/13 02:00:00)  776
(06/23/13 03:00:00)  870
(06/23/13 04:00:00) 1263
(06/23/13 05:00:00) 1457
(06/23/13 06:00:00) 1621
(06/23/13 07:00:00) 1606
(06/23/13 08:00:00) 1779
(06/23/13 09:00:00) 1832
(06/23/13 10:00:00) 1808
(06/23/13 11:00:00) 1789
(06/23/13 12:00:00) 1907
(06/23/13 13:00:00) 2021
(06/23/13 14:00:00) 2018
(06/23/13 15:00:00) 1836
(06/23/13 16:00:00) 1627
(06/23/13 17:00:00) 1331
(06/23/13 18:00:00) 1059
(06/23/13 19:00:00)  817
(06/23/13 20:00:00)  761
(06/23/13 21:00:00)  781
(06/23/13 22:00:00)  752
(06/23/13 23:00:00)  725
(06/24/13 00:00:00)  708
(06/24/13 01:00:00)  718
(06/24/13 02:00:00)  791
(06/24/13 03:00:00)  857
(06/24/13 04:00:00) 1094
(06/24/13 05:00:00) 1247
(06/24/13 06:00:00) 1316
(06/24/13 07:00:00) 1401
(06/24/13 08:00:00) 1575
(06/24/13 09:00:00) 1604
(06/24/13 10:00:00) 1774
(06/24/13 11:00:00) 1865
(06/24/13 12:00:00) 1964
(06/24/13 13:00:00) 2002
(06/24/13 14:00:00) 2043
(06/24/13 15:00:00) 2030
(06/24/13 16:00:00) 1733
(06/24/13 17:00:00) 1420
(06/24/13 18:00:00) 1075
(06/24/13 19:00:00)  831
(06/24/13 20:00:00)  789
(06/24/13 21:00:00)  791
(06/24/13 22:00:00)  715
(06/24/13 23:00:00)  683
(06/25/13 00:00:00)  802
(06/25/13 01:00:00)  811
(06/25/13 02:00:00)  838
(06/25/13 03:00:00)  851
(06/25/13 04:00:00) 1064
(06/25/13 05:00:00) 1191
(06/25/13 06:00:00) 1242
(06/25/13 07:00:00) 1233
(06/25/13 08:00:00) 1452
(06/25/13 09:00:00) 1501
(06/25/13 10:00:00) 1718
(06/25/13 11:00:00) 1861
(06/25/13 12:00:00) 1896
(06/25/13 13:00:00) 2073
(06/25/13 14:00:00) 2279
(06/25/13 15:00:00) 2239
(06/25/13 16:00:00) 2018
(06/25/13 17:00:00) 1550
(06/25/13 18:00:00) 1182
(06/25/13 19:00:00) 1063
(06/25/13 20:00:00)  973
(06/25/13 21:00:00) 1027
(06/25/13 22:00:00)  961
(06/25/13 23:00:00)  890
(06/26/13 00:00:00)  894
(06/26/13 01:00:00)  835
(06/26/13 02:00:00)  852
(06/26/13 03:00:00)  893
(06/26/13 04:00:00) 1111
(06/26/13 05:00:00) 1239
(06/26/13 06:00:00) 1263
(06/26/13 07:00:00) 1260
(06/26/13 08:00:00) 1451
(06/26/13 09:00:00) 1556
(06/26/13 10:00:00) 1733
(06/26/13 11:00:00) 1981
(06/26/13 12:00:00) 2063
(06/26/13 13:00:00) 2150
(06/26/13 14:00:00) 2278
(06/26/13 15:00:00) 2188
(06/26/13 16:00:00) 1980
(06/26/13 17:00:00) 1611
(06/26/13 18:00:00) 1381
(06/26/13 19:00:00) 1211
(06/26/13 20:00:00) 1129
(06/26/13 21:00:00) 1092
(06/26/13 22:00:00) 1009
(06/26/13 23:00:00)  973
(06/27/13 00:00:00)  865
(06/27/13 01:00:00)  805
(06/27/13 02:00:00)  840
(06/27/13 03:00:00)  813
(06/27/13 04:00:00) 1010
(06/27/13 05:00:00) 1201
(06/27/13 06:00:00) 1329
(06/27/13 07:00:00) 1343
(06/27/13 08:00:00) 1532
(06/27/13 09:00:00) 1612
(06/27/13 10:00:00) 1768
(06/27/13 11:00:00) 1977
(06/27/13 12:00:00) 2089
(06/27/13 13:00:00) 2247
(06/27/13 14:00:00) 2270
(06/27/13 15:00:00) 2275
(06/27/13 16:00:00) 2155
(06/27/13 17:00:00) 1639
(06/27/13 18:00:00) 1315
(06/27/13 19:00:00) 1099
(06/27/13 20:00:00) 1052
(06/27/13 21:00:00) 1099
(06/27/13 22:00:00)  965
(06/27/13 23:00:00)  961
(06/28/13 00:00:00)  765
(06/28/13 01:00:00)  830
(06/28/13 02:00:00)  874
(06/28/13 03:00:00)  845
(06/28/13 04:00:00) 1011
(06/28/13 05:00:00) 1160
(06/28/13 06:00:00) 1232
(06/28/13 07:00:00) 1310
(06/28/13 08:00:00) 1467
(06/28/13 09:00:00) 1639
(06/28/13 10:00:00) 1704
(06/28/13 11:00:00) 3704
(06/28/13 12:00:00) 7350
(06/28/13 13:00:00) 7629
(06/28/13 14:00:00) 7570
(06/28/13 15:00:00) 7276
(06/28/13 16:00:00) 7189
(06/28/13 17:00:00) 7139
(06/28/13 18:00:00) 7167
(06/28/13 19:00:00) 6871
(06/28/13 20:00:00) 6575
(06/28/13 21:00:00) 6112
(06/28/13 22:00:00) 5276
(06/28/13 23:00:00) 4407
(06/29/13 00:00:00) 3741
(06/29/13 01:00:00) 3427
(06/29/13 02:00:00) 3311
(06/29/13 03:00:00) 3096
(06/29/13 04:00:00) 3010
(06/29/13 05:00:00) 3301
(06/29/13 06:00:00) 3783
(06/29/13 07:00:00) 4578
(06/29/13 08:00:00) 5599
(06/29/13 09:00:00) 6590
(06/29/13 10:00:00) 6998
(06/29/13 11:00:00) 7323
(06/29/13 12:00:00) 7282
(06/29/13 13:00:00) 7009
(06/29/13 14:00:00) 6636
(06/29/13 15:00:00) 6407
(06/29/13 16:00:00) 6386
(06/29/13 17:00:00) 6505
(06/29/13 18:00:00) 3104
(06/29/13 19:00:00)  939
(06/29/13 20:00:00)  915
(06/29/13 21:00:00)  955
(06/29/13 22:00:00)  968
(06/29/13 23:00:00)  870
(06/30/13 00:00:00) 3504
(06/30/13 01:00:00) 3122
(06/30/13 02:00:00) 2874
(06/30/13 03:00:00) 2613
(06/30/13 04:00:00) 2905
(06/30/13 05:00:00) 2806
(06/30/13 06:00:00) 3244
(06/30/13 07:00:00) 3789
(06/30/13 08:00:00) 5015
(06/30/13 09:00:00) 6031
(06/30/13 10:00:00) 6841
(06/30/13 11:00:00) 7014
(06/30/13 12:00:00) 7265
(06/30/13 13:00:00) 7460
(06/30/13 14:00:00) 7275
(06/30/13 15:00:00) 7531
(06/30/13 16:00:00) 7013
(06/30/13 17:00:00) 6637
(06/30/13 18:00:00) 5770
(06/30/13 19:00:00) 5593
(06/30/13 20:00:00) 6524
(06/30/13 21:00:00) 5081
(06/30/13 22:00:00) 1131
(06/30/13 23:00:00)  949

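This is a full week of hourly time series data. Given this data, I need to use Holt-Winters to predict the number of clicks for the next hour (06/28/13 00:00:00). I have tried to figure it out, but I am really confused. I would appreciate it if someone could point me in the right direction.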

Edit:

I am using the Holt-Winters forecasting function as follows:

search_fit <- HoltWinters(z)
p <- predict(search_fit, n.ahead = 24)

The problem is that Holt-Winters does not pick up any trend in the forecast. Is that normal? There has been a big change in the data since June 28, 2013.

Here are my forecast values:

Time Series:
Start = c(15887, 1) 
End = c(15887, 24) 
Frequency = 24 
            fit
 [1,]  927.6462
 [2,]  935.2716
 [3,] 1006.5636
 [4,] 1066.4182
 [5,] 1295.5852
 [6,] 1442.9397
 [7,] 1508.1693
 [8,] 1590.9613
 [9,] 1762.5033
[10,] 1789.1287
[11,] 1958.1083
[12,] 2049.1711
[13,] 2054.7757
[14,] 2168.1302
[15,] 2163.1514
[16,] 1979.5268
[17,] 1772.7355
[18,] 1483.0484
[19,] 1220.1946
[20,]  987.2366
[21,]  938.1745
[22,]  965.5915
[23,]  940.4669
[24,]  911.0089

Here is the forecast plot.

The trend component does not change. I may be doing something wrong in the way I use the Holt-Winters forecasting method.


2 Answers


Here are a few suggestions and things to try:

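One reason you are not seeing a trend from HoltWinters may be that the change you are referring to (on June 28, 2013 in your data) is showing up in the level component. Starting around midday on June 28, the click counts change substantially.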

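One note: HoltWinters() reports the smoothing parameters (alpha, beta, gamma). It also gives you the slope b. If beta is 0, that only means the trend does not change over the course of the time series. It starts with slope b and keeps that same slope throughout.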
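To make the point about beta and the slope concrete, here is a minimal sketch. It does not use the asker's data: `clicks` and `z` are a synthetic hourly series with a daily cycle and a jump in level, and the smoothing parameters are fixed by hand (with beta = 0) purely for illustration. Calling HoltWinters(z) without them would estimate all three instead.

```r
# Synthetic stand-in for the hourly click series (NOT the asker's data):
# 8 days of 24 hourly values with a daily cycle and a jump in level on day 6.
set.seed(1)
hours  <- 0:(8 * 24 - 1)
daily  <- 600 * sin(2 * pi * (hours %% 24 - 8) / 24)
jump   <- ifelse(hours >= 5 * 24 + 12, 4000, 0)   # level shift at noon on day 6
clicks <- 1400 + daily + jump + rnorm(length(hours), sd = 50)
z      <- ts(clicks, frequency = 24)              # 24 observations per season

# Smoothing parameters fixed by hand for illustration; beta = 0 means the
# slope is never updated. (HoltWinters(z) alone would estimate all three.)
fit <- HoltWinters(z, alpha = 0.5, beta = 0, gamma = 0.3)

# The slope b keeps its starting value for the whole series ...
trend_path <- fit$fitted[, "trend"]
print(range(trend_path))
print(fit$coefficients["b"])

# ... while the level component absorbs the jump.
level_path <- fit$fitted[, "level"]
print(round(c(first = level_path[1], last = level_path[length(level_path)])))

# 24-hour-ahead forecast built from the final level, slope and seasonal terms
p <- predict(fit, n.ahead = 24)
print(head(p))
```

In this sketch the "trend" column is flat while the "level" column steps up, which is the pattern to look for in search_fit$fitted on the real series.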

Things to try

Try this:

 > library(forecast)
 > accuracy(forecast(search_fit, h = 24))

Checking the autocorrelation of the residuals is also a good idea:

 > acf(residuals(search_fit), lag.max = 24)
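As a rough sketch of these residual checks using base R only: the asker's `z` is not reproducible here, so the example assumes R's built-in monthly `co2` series as a stand-in. The Ljung-Box test and the hand-computed RMSE/MAE are additions for illustration, not something from the question.

```r
# Stand-in data: R's built-in monthly co2 series (the asker's z is not available).
fit <- HoltWinters(co2)

# In-sample one-step-ahead errors. A HoltWinters object has no $residuals
# element, so the residuals() method is used instead.
res <- residuals(fit)

# Autocorrelation of the residuals for the first 24 lags, without plotting
res_acf <- acf(res, lag.max = 24, plot = FALSE)
print(round(res_acf$acf[2:7], 3))   # autocorrelations at lags 1 to 6

# Ljung-Box test: a small p-value suggests autocorrelation is left in the residuals
print(Box.test(res, lag = 24, type = "Ljung-Box"))

# Simple accuracy summaries computed by hand from the residuals
print(c(RMSE = sqrt(mean(res^2)), MAE = mean(abs(res))))
```

If the residuals still show clear autocorrelation, the model is leaving structure unexplained, and that is a hint that a different specification may be worth trying.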

Some general comments that may help your analysis

  • Besides HoltWinters (exponential smoothing), you may also want to look into ARIMA methods. As Rob Hyndman says at the start of Chapter 8 of his textbook:

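Exponential smoothing and ARIMA models are the two most widely-used approaches to time series forecasting, and provide complementary approaches to the problem. While exponential smoothing models were based on a description of trend and seasonality in the data, ARIMA models aim to describe the autocorrelations in the data.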

Hope this helps.

answered 2013-07-17T07:06:35.267

Try auto.arima or ets for automatic time series modelling. See the code here: http://rpubs.com/newajay/timeseries
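A minimal sketch of what that could look like, assuming the forecast package is installed. The series `z` below is synthetic hourly data with a daily cycle, made up for illustration; it is not the asker's data, and the models selected on the real series may differ.

```r
library(forecast)  # assumed to be installed: install.packages("forecast")

# Synthetic hourly stand-in series: 7 days x 24 hours with a daily cycle
set.seed(42)
hours <- 0:(7 * 24 - 1)
z <- ts(1400 + 600 * sin(2 * pi * hours / 24) + rnorm(length(hours), sd = 50),
        frequency = 24)

# Let each function choose its own model specification
fit_ets   <- ets(z)
fit_arima <- auto.arima(z)

# Forecast the next 24 hours with each model
fc_ets   <- forecast(fit_ets, h = 24)
fc_arima <- forecast(fit_arima, h = 24)

# In-sample accuracy measures for a quick side-by-side comparison
print(accuracy(fc_ets))
print(accuracy(fc_arima))
```

In-sample accuracy tends to favour the more flexible model, so holding out the last day and comparing forecasts against it is usually a fairer test.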

answered 2017-05-21T08:16:31.670