These are series of values sampled every hour over 30 days. I grouped them by hour of day; two of the groups are shown below:

{'date':
['2019-11-09','2019-11-10','2019-11-11','2019-11-12','2019-11-13','2019-11-14','2019-11-15','2019-11-16','2019-11-17','2019-11-18','2019-11-19','2019-11-20','2019-11-21','2019-11-22','2019-11-23','2019-11-24','2019-11-25','2019-11-26','2019-11-27','2019-11-28','2019-11-29','2019-11-30','2019-12-01','2019-12-02','2019-12-03','2019-12-04','2019-12-05','2019-12-06','2019-12-07','2019-12-08'],
'hora0':[111666.5,121672.91666666667,87669.33333333333,89035.58333333333,91707.91666666667,94449.33333333333,103476.91666666667,123271.5,133306.58333333334,103149.91666666667,106310.25,91830.25,77733.75,96823.25,102880.25,118383.33333333333,95076.66666666667,93561.83333333333,97651.58333333333,112180.0,118051.75,135456.0,149553.0,125797.25,126098.0,128603.75,84631.08333333333,85683.16666666667,96377.16666666667,113161.16666666667],
'hora2':[83768.83333333333,83319.58333333333,72922.75,71893.75,73933.0,76598.83333333333,81021.75,93588.83333333333,94514.08333333333,87147.66666666667,91464.08333333333,74022.41666666667,63709.166666666664,75939.33333333333,79904.16666666667,84435.33333333333,76736.0,85237.33333333333,79162.75,91729.58333333333,99081.58333333333,106440.41666666667,112064.66666666667,111635.58333333333,110168.58333333333,111241.25,62634.083333333336,68203.33333333333,71515.16666666667,80674.66666666667]}

The series have similar distributions: hourly samples over 30 days

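The AIC value is the Akaike information criterion, which compares predictive models against each other. This is the code I use to fit a range of ARIMA models and find the order with the lowest AIC: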

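from itertools import product
from warnings import filterwarnings

import pandas as pd
# Assumed import: the legacy statsmodels ARIMA class, which is what fit(disp=0) expects
from statsmodels.tsa.arima_model import ARIMA


def AIC_iteration_i(train):
    filterwarnings("ignore")
    history = [x for x in train.iloc[:, 0]]
    # Try every (p, d, q) combination with each value in 0..5
    p = d = q = range(0, 6)
    pdq = list(product(p, d, q))
    aic_results = []
    parameter = []
    for param in pdq:
        try:
            model = ARIMA(history, order=param)
            results = model.fit(disp=0)
            # Uncomment the line below to print the AIC of each (p, d, q)
            # print('ARIMA{} - AIC:{}'.format(param, results.aic))
            aic_results.append(results.aic)
            parameter.append(param)
        except Exception:
            # Skip orders for which the model cannot be fitted
            continue
    d = dict(ARIMA=parameter, AIC=aic_results)
    results_table = pd.DataFrame(dict([(k, pd.Series(v)) for k, v in d.items()]))
    # Order (p, d, q) with the minimum AIC
    order = results_table.loc[results_table['AIC'].idxmin()]['ARIMA']
    return order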

For every series it returns the same order (p, d, q) with the lowest AIC: (0, 2, 1).
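For reference, the quantity being minimised is conventionally defined as below, where k is the number of estimated parameters and L̂ is the maximised likelihood of the fitted model. The symbols k and L̂ are not used in the question itself; they are introduced here only to state the standard definition.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

A lower AIC means a better trade-off between fit and number of parameters, but only among models fitted to the same data, so AIC values from different series are not directly comparable.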

I get the forecast with the code below, but the results for hour 2 are not OK:

import math

# time series hora0.iloc[:,0] and hora1.iloc[:,0] from pandas df
trained = list(hora0.iloc[:, 0])

# order obtained above: (0, 2, 1)
orders = order

# 80/20 split: the first 80% for training, the rest for testing
size = math.ceil(len(trained) * .8)
train, test = trained[:size], trained[size:]
predictions = []
predictionslower = []
predictionsupper = []
for k in range(len(test)):
    model = ARIMA(trained, order=orders)
    model_fit = model.fit(disp=0)
    # One-step-ahead forecast with its confidence interval
    forecast, stderr, conf_int = model_fit.forecast()
    yhat = forecast[0]
    yhatlower = conf_int[0][0]
    yhatupper = conf_int[0][1]
    predictions.append(yhat)
    predictionslower.append(yhatlower)
    predictionsupper.append(yhatupper)
    # Add the observed value before fitting the next step
    obs = test[k]
    trained.append(obs)
# error = mean_squared_error(test, predictions)
predictions

Predictions

prediction for hour0: [113815.15072419723,128600.77967037176,131580.85654685542,83200.24743417211,83167.65192576911,95062.06180437957]
prediction for hour1: [79564.70753715932,112491.2694928094,114410.34654966182,60882.18766484651,nan,nan]

I also checked the AIC for series 2 with pmd-arima, and the order it gives for the SARIMAX model is the same. Please shed some light on this.

1 Answer

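The values for hour 2 (and for the other hours too) are non-stationary as a time series. To remove the non-stationarity we can apply differencing or the natural logarithm to the original data: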

hora2 = np.log(hora2)

{'date':['2019-11-09','2019-11-10','2019-11-11','2019-11-12','2019-11-13','2019-11-14','2019-11-15','2019-11-16','2019-11-17','2019-11-18','2019-11-19','2019-11-20','2019-11-21','2019-11-22','2019-11-23','2019-11-24','2019-11-25','2019-11-26','2019-11-27','2019-11-28','2019-11-29','2019-11-30','2019-12-01','2019-12-02','2019-12-03','2019-12-04','2019-12-05','2019-12-06','2019-12-07','2019-12-08'],
'hora2':[11.3358163,11.33043889,11.19715594,11.18294461,11.21091456,11.24633712,11.30247292,11.44666635,11.45650413,11.37535928,11.42370164,11.21212325,11.06208373,11.23769005,11.28858328,11.34374123,11.24812624,11.3531948,11.27926114,11.42660022,11.50369886,11.57534064,11.62683136,11.62299513,11.60976705,11.61945655,11.04506487,11.13024872,11.17766483,11.29817989]}
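The differencing alternative mentioned above can be sketched in a few lines. This is only an illustration, not the code used in the answer: the helper name `difference` is hypothetical, and the sample values are the first five log-transformed `hora2` values listed above.

```python
def difference(series, lag=1):
    """Return the lag-`lag` differences series[i] - series[i - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# First five log-transformed hora2 values from the data above
log_hora2 = [11.3358163, 11.33043889, 11.19715594, 11.18294461, 11.21091456]

first_diff = difference(log_hora2)    # what d=1 does inside ARIMA
second_diff = difference(first_diff)  # what d=2 does inside ARIMA
print(first_diff)
print(second_diff)
```

Each round of differencing shortens the series by one value, which matters for a series of only 30 points.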

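Then, for each "horaX" series, get the order of the model with the lowest AIC (Akaike information criterion) and use it in ARIMA(trained, order=orders). Some series still returned NaN values in the forecast, so for those I had to take the order with the second or third lowest AIC. Once the forecast is returned, apply the exponential (the inverse of the logarithm) to recover the original scale.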
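The fallback to the second or third lowest AIC could be automated roughly as follows. This is a sketch under assumptions: `first_finite_order` and `forecast_fn` are hypothetical names, the AIC numbers are made up for illustration, and `fake_forecast` stands in for a real model fit.

```python
import math


def first_finite_order(candidates, forecast_fn):
    """Try orders from lowest to highest AIC; return the first with finite forecasts.

    candidates: iterable of (order, aic) pairs.
    forecast_fn: hypothetical callable mapping an order to a list of forecasts.
    """
    for order, _aic in sorted(candidates, key=lambda pair: pair[1]):
        try:
            forecasts = forecast_fn(order)
        except Exception:
            continue  # the fit failed for this order, try the next one
        if all(math.isfinite(value) for value in forecasts):
            return order, forecasts
    return None, []


def fake_forecast(order):
    # Stand-in for a real fit: pretend the lowest-AIC order yields NaN
    return [float("nan")] if order == (0, 2, 1) else [11.69, 12.00]


# Illustrative (order, AIC) pairs; the AIC values are invented
candidates = [((1, 1, 1), 520.0), ((0, 2, 1), 512.3), ((2, 1, 0), 515.8)]
best_order, best_forecasts = first_finite_order(candidates, fake_forecast)
print(best_order)
```

Here the lowest-AIC order (0, 2, 1) is skipped because its forecast contains NaN, so the next candidate by AIC is returned instead.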

{'hora2':[11.6948938,12.00191037,11.81401922,11.77476296,11.83965601,11.89443423]}

hora2 = np.exp(hora2)

{'hora2':[119957.62142129,163066.00981609,135133.60347713,129931.53854787,138642.78415756,146449.24980086]}
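The back-transformation can be checked with the standard library alone. The sketch below uses the log-scale forecasts listed above and assumes nothing beyond `math.exp` being the inverse of `math.log`; the variable names are illustrative.

```python
import math

# Log-scale forecasts for hora2, copied from the answer above
log_forecast = [11.6948938, 12.00191037, 11.81401922,
                11.77476296, 11.83965601, 11.89443423]

# exp() undoes the natural log and returns the forecasts to the original units
forecast = [math.exp(value) for value in log_forecast]
print(forecast)

# Round trip on an original observation: exp(log(x)) recovers x
x = 83768.83333333333
round_trip = math.exp(math.log(x))
print(round_trip)
```

The printed forecasts should agree with the values listed above up to the rounding of the log-scale inputs.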

The forecast results for the test data are shown in the figure:

[figure: forecast results for the test data]

answered 2020-01-01T19:22:53.083