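I want to use auto_arima because I need a SARIMAX model and I want it to choose the best parameters. My code runs through many different combinations (it is forecasting multiple items) and through several models, so that at the end I can pick the "best" model for each combination. I have already handled building all the right combinations earlier in the code (using dictionaries, etc.), but is there a way to take the best parameters for each combination from the stepwise model, so that the model is automatically fitted with those "best" parameters? I know you can normally enter the best parameters by hand and then fit the model, but is there a way to automate this?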
results_dict = {}
preds_dict = {}
all_preds_df = pd.DataFrame({k: [] for k in df.columns})
all_preds_df2 = pd.DataFrame({k: [] for k in df2.columns})
all_preds_df3 = pd.DataFrame({k: [] for k in test.columns})
for itemnum in itemnumbers:
    for depot in depots:
        for department in departments:
            try:
                df = data_dict[itemnum][depot][department]  # train
                df2 = data_dict2[itemnum][depot][department]  # test
                train = data_dict_train[itemnum][depot][department]  # time series
                test = data_dict_test[itemnum][depot][department]  # time series
                print(itemnum, depot, department)
            except KeyError:
                continue
            # Set X and y
            X_train_date = df['Date']
            X_train = df.drop(['QUANTITY', 'Date', 'DESCRIPTION', 'NAME', 'Max_SEASON_DESC'], axis=1)
            y_train = df['QUANTITY']
            X_test_date = df2['Date']
            X_test = df2.drop(['QUANTITY', 'Date', 'DESCRIPTION', 'NAME', 'Max_SEASON_DESC'], axis=1)
            y_test = df2['QUANTITY']
            # Train and Test for Time Series - this is kind of not needed
            train_data = train
            test_data = test
            # Scaled Data
            # Establish model
            # Linear Regression
            # Random Forest
            # Gradient Boosted Tree with Grid Search
            parameters = {
                "n_estimators": [1, 5, 10, 25, 50, 100, 250, 500],
                "max_depth": [1, 3, 5, 7, 9],
                "learning_rate": [1, 2, 3, 4, 5],
                "criterion": ['friedman_mse', 'mse', 'mae']}
            gbr = GradientBoostingRegressor()
            gbr_grid = GridSearchCV(estimator=gbr, param_grid=parameters, cv=3)
            gbr_grid.fit(X_train, y_train)
            y_pred_gbr = gbr_grid.predict(X_train)
            y_pred_test_gbr = gbr_grid.predict(X_test)
            # Multinomial Naive Bayes
            # Time Series
            # Holt Winter
            holt_winter = ExponentialSmoothing(np.asarray(train_data['Sum_QUANTITY']), seasonal_periods=12, trend='add', seasonal='add')
            hw_fit = holt_winter.fit()
            hw_forecast = hw_fit.forecast(len(test_data))
            stepwise_model = auto_arima(train_data['Sum_QUANTITY'], start_p=1, start_q=1,
                                        max_p=3, max_q=3, m=12,
                                        start_P=0, seasonal=True,
                                        d=1, D=1, trace=True,
                                        error_action='ignore',
                                        suppress_warnings=True,
                                        stepwise=True)
            sm_fit = stepwise_model.fit(train_data)
            predictions_sm = sm_fit.forecast(len(test_data.Sum_QUANTITY))
            predictions_sm = pd.Series(predictions_sm, index=test_data.index)
            residuals_sm = test_data.Sum_QUANTITY - predictions_sm
            # Create forecast columns
            test['SM_Prediction'] = predictions_sm
            # df2 = df2.merge(pd.DataFrame(y_pred_test), how='left', on=['Date'])
            all_preds_df = pd.concat([all_preds_df, df])
            all_preds_df2 = pd.concat([all_preds_df2, df2])
            all_preds_df3 = pd.concat([all_preds_df3, test])
#Finish timer
stop = timeit.default_timer()
print('Time: ', stop - start)
The code should be fairly self-explanatory. Partway through, it prints the following output and then ends with this error:
Performing stepwise search to minimize aic
ARIMA(1,1,1)(0,1,1)[12] : AIC=inf, Time=0.30 sec
ARIMA(0,1,0)(0,1,0)[12] : AIC=582.161, Time=0.01 sec
ARIMA(1,1,0)(1,1,0)[12] : AIC=571.883, Time=0.05 sec
ARIMA(0,1,1)(0,1,1)[12] : AIC=inf, Time=0.21 sec
ARIMA(1,1,0)(0,1,0)[12] : AIC=580.572, Time=0.01 sec
ARIMA(1,1,0)(2,1,0)[12] : AIC=inf, Time=0.49 sec
ARIMA(1,1,0)(1,1,1)[12] : AIC=inf, Time=0.32 sec
ARIMA(1,1,0)(0,1,1)[12] : AIC=inf, Time=0.20 sec
ARIMA(1,1,0)(2,1,1)[12] : AIC=541.710, Time=0.82 sec
ARIMA(1,1,0)(2,1,2)[12] : AIC=inf, Time=0.95 sec
ARIMA(1,1,0)(1,1,2)[12] : AIC=inf, Time=0.69 sec
ARIMA(0,1,0)(2,1,1)[12] : AIC=inf, Time=0.58 sec
ARIMA(2,1,0)(2,1,1)[12] : AIC=572.817, Time=0.33 sec
ARIMA(1,1,1)(2,1,1)[12] : AIC=inf, Time=0.81 sec
ARIMA(0,1,1)(2,1,1)[12] : AIC=541.502, Time=0.82 sec
ARIMA(0,1,1)(1,1,1)[12] : AIC=inf, Time=0.40 sec
ARIMA(0,1,1)(2,1,0)[12] : AIC=540.432, Time=0.49 sec
ARIMA(0,1,1)(1,1,0)[12] : AIC=572.374, Time=0.06 sec
ARIMA(0,1,0)(2,1,0)[12] : AIC=inf, Time=0.38 sec
ARIMA(1,1,1)(2,1,0)[12] : AIC=inf, Time=0.65 sec
ARIMA(0,1,2)(2,1,0)[12] : AIC=542.396, Time=0.50 sec
ARIMA(1,1,2)(2,1,0)[12] : AIC=inf, Time=1.12 sec
ARIMA(0,1,1)(2,1,0)[12] intercept : AIC=540.868, Time=0.54 sec
Best model: ARIMA(0,1,1)(2,1,0)[12]
Total fit time: 10.734 seconds
ValueError: could not convert string to float: '20/06/4'
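One possible direction, offered as a sketch rather than a confirmed fix: the model returned by `auto_arima` is already fitted with the best parameters the search found, so the selected orders can be read from it directly and no second `fit` call is needed. The error above probably comes from `stepwise_model.fit(train_data)`, which passes the whole DataFrame (including a non-numeric column) where a single numeric series is expected. The helper names `search_best_sarima`, `record_best_params` and `forecast_values` below are hypothetical and only illustrate the idea; `order`, `seasonal_order` and `predict(n_periods=...)` are attributes and methods of pmdarima's fitted model.

```python
def search_best_sarima(series, m=12):
    """Run the stepwise search; the returned model is already fitted."""
    # Imported here so the two helpers below can be used with any
    # already-fitted pmdarima model.
    import pmdarima as pm

    return pm.auto_arima(
        series,  # one numeric column, e.g. train_data['Sum_QUANTITY']
        start_p=1, start_q=1, max_p=3, max_q=3,
        m=m, start_P=0, seasonal=True, d=1, D=1,
        error_action="ignore", suppress_warnings=True, stepwise=True,
    )


def record_best_params(best_params, key, model):
    """Store the orders chosen by the search under a combination key."""
    best_params[key] = {
        "order": tuple(model.order),                    # (p, d, q)
        "seasonal_order": tuple(model.seasonal_order),  # (P, D, Q, m)
    }
    return best_params[key]


def forecast_values(model, n_periods):
    """Forecast from the fitted model and return plain values (no index)."""
    return list(model.predict(n_periods=n_periods))
```

Inside the loop this might be used as `model = search_best_sarima(train_data['Sum_QUANTITY'])`, then `record_best_params(best_params, (itemnum, depot, department), model)` and `test['SM_Prediction'] = forecast_values(model, len(test_data))`. Returning plain values avoids an index mismatch when the forecast is assigned to the test frame. The stored `order` and `seasonal_order` can also be passed to another SARIMAX implementation if a refit is wanted.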