I am currently using XGBoost for prediction, and I want to know which set of hyperparameters will give the best results. I have also tried Optuna, but the predictions seem to be off.
import optuna
import xgboost as xgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def objective(trial, data=X1, target=Y1):
    X1_train, X1_test, y1_train, y1_test = train_test_split(data, target, test_size=0.2, random_state=100)
    param = {
        'tree_method': 'exact',  # Which one to use here: exact or approx?
        'lambda': trial.suggest_loguniform('lambda', 1e-3, 100.0),  # What should be the range?
        'alpha': trial.suggest_loguniform('alpha', 1e-3, 10.0),  # What should be the range?
        'colsample_bytree': trial.suggest_categorical('colsample_bytree', [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]),
        'subsample': trial.suggest_categorical('subsample', [0.4, 0.5, 0.6, 0.7, 0.8, 1.0]),
        'learning_rate': trial.suggest_categorical('learning_rate', [0.008, 0.009, 0.01, 0.012, 0.014, 0.016, 0.018, 0.02]),
        'n_estimators': 1000,
        'max_depth': trial.suggest_categorical('max_depth', [3, 4, 5, 6, 7, 8, 9, 10]),
        'random_state': trial.suggest_categorical('random_state', [25, 50, 100]),  # What should be the range?
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 100),  # What should be the range?
        'objective': 'reg:squarederror',
    }
    eval_set = [(X1_train, y1_train), (X1_test, y1_test)]
    xg_reg1 = xgb.XGBRegressor(**param)
    # Note: passing early_stopping_rounds/eval_metric to fit() works in older
    # xgboost releases; newer versions expect them in the constructor instead.
    xg_reg1.fit(X1_train, y1_train, early_stopping_rounds=100,
                eval_metric=["rmse", "mae"], eval_set=eval_set, verbose=False)
    preds = xg_reg1.predict(X1_test)
    rmse = mean_squared_error(y1_test, preds, squared=False)  # RMSE on the held-out split
    return rmse
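As an aside, suggest_loguniform is deprecated in recent Optuna releases in favor of suggest_float with log=True. A minimal, self-contained sketch of the modern form, assuming Optuna >= 3.0 (objective_modern and study_demo are illustrative names, and the returned value is a placeholder, not a real loss):

import optuna

def objective_modern(trial):
    # suggest_float(..., log=True) replaces the deprecated suggest_loguniform
    lam = trial.suggest_float('lambda', 1e-3, 100.0, log=True)
    alpha = trial.suggest_float('alpha', 1e-3, 10.0, log=True)
    return lam + alpha  # placeholder objective, for illustration only

study_demo = optuna.create_study(direction='minimize')
study_demo.optimize(objective_modern, n_trials=5)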
The hyperparameters are optimized with Optuna as follows:
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)
print('Number of finished trials:', len(study.trials))
print('Best trial:', study.best_trial.params)
The parameters of the best trial are then used in XGBoost for prediction.
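For reference, a minimal sketch of that final step, assuming X1 and Y1 are still in scope and the same split is recreated. Note that study.best_trial.params holds only the trial.suggest_* values, so the constants fixed inside objective() have to be re-added:

# Recreate the split used inside objective()
X1_train, X1_test, y1_train, y1_test = train_test_split(X1, Y1, test_size=0.2, random_state=100)

# Rebuild the full parameter set: best_trial.params contains only the
# suggested parameters, not the constants ('tree_method', etc.)
best_params = dict(study.best_trial.params)
best_params.update({
    'tree_method': 'exact',
    'n_estimators': 1000,
    'objective': 'reg:squarederror',
})

final_model = xgb.XGBRegressor(**best_params)
final_model.fit(X1_train, y1_train)        # refit on the training split
final_preds = final_model.predict(X1_test)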