
I am currently using XGBoost for prediction, and I would like to know which set of hyperparameters gives the best results. I have also tried Optuna, but the predictions seem to be off.

import optuna
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def objective(trial, data=X1, target=Y1):
    X1_train, X1_test, y1_train, y1_test = train_test_split(X1, Y1, test_size=0.2, random_state=100)
    param = {
        'tree_method': 'exact',  # Which one to use here: exact or approx?
        'lambda': trial.suggest_loguniform('lambda', 1e-3, 100.0),  # What should be the range?
        'alpha': trial.suggest_loguniform('alpha', 1e-3, 10.0),  # What should be the range?
        'colsample_bytree': trial.suggest_categorical('colsample_bytree', [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]),
        'subsample': trial.suggest_categorical('subsample', [0.4, 0.5, 0.6, 0.7, 0.8, 1.0]),
        'learning_rate': trial.suggest_categorical('learning_rate', [0.008, 0.009, 0.01, 0.012, 0.014, 0.016, 0.018, 0.02]),
        'n_estimators': 1000,
        'max_depth': trial.suggest_categorical('max_depth', [3, 4, 5, 6, 7, 8, 9, 10]),
        'random_state': trial.suggest_categorical('random_state', [25, 50, 100]),  # What should be the range?
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 100),  # What should be the range?
        'objective': 'reg:squarederror'
    }
    eval_set = [(X1_train, y1_train), (X1_test, y1_test)]
    xg_reg1 = xgb.XGBRegressor(**param)
    xg_reg1.fit(X1_train, y1_train, early_stopping_rounds=100, eval_metric=["rmse", "mae"], eval_set=eval_set, verbose=False)
    preds = xg_reg1.predict(X1_test)
    rmse = mean_squared_error(y1_test, preds, squared=False)
    return rmse

The hyperparameters are optimized with Optuna as follows:

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)
print('Number of finished trials:', len(study.trials))
print('Best trial:', study.best_trial.params)

The parameters of the best trial are then used in XGBoost for prediction.

