Tried grid.cv_results_, but it did not fix the problem

from sklearn.model_selection import GridSearchCV
params = {
    'decisiontreeclassifier__max_depth': [1, 2],
    'pipeline-1__clf__C': [0.001, 0.1, 100.0]
}
grid = GridSearchCV(estimator = mv_clf,
    param_grid = params,
    cv = 10,
    scoring = 'roc_auc')
grid.fit(X_train, y_train)
for params, mean_score, scores in grid.grid_scores_:
    print("%0.3f+/-%0.2f %r" %
        (mean_score, scores.std() / 2, params))
#AttributeError: 'GridSearchCV' object has no attribute 'grid_scores_'

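I tried replacing grid.grid_scores_ with grid.cv_results_. The goal is to print each combination of hyperparameter values together with the mean ROC AUC score computed by 10-fold cross-validation.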

1 Answer

In recent versions of the scikit-learn library, grid_scores_ was deprecated and later removed; it has been replaced by cv_results_.

cv_results_ gives the detailed results of the grid search run.

grid.cv_results_.keys()

Output: dict_keys(['mean_fit_time', 'std_fit_time', 'mean_score_time', 'std_score_time', 'param_n_estimators', 'params', 'split0_test_score', 
'split1_test_score', 'split2_test_score', 'split3_test_score', 'split4_test_score',
'mean_test_score', 'std_test_score', 'rank_test_score'])

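Compared with grid_scores_, cv_results_ gives more detailed output. The result is a dictionary, so the relevant metrics can be extracted by looking up its keys. Below is an example from a grid search run with cv=5: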

for key in ['mean_test_score', 'std_test_score', 'param_n_estimators']:
    print(key, " : ", grid.cv_results_[key])

 Output:   mean_test_score  :  [0.833 0.83 0.83 0.837 0.838 0.8381 0.83]
           std_test_score  :  [0.011 0.009 0.010 0.0106 0.010 0.0102 0.0099]
           param_n_estimators  :  [20 30 40 50 60 70 80]
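
To get back the one-line-per-combination printout from the question, the per-key arrays can be zipped together. The following is a minimal sketch, not a definitive implementation: format_cv_results is a hypothetical helper name, and demo_results is a hand-written placeholder dict that only imitates the shape of cv_results_ (its numbers are not real scores). It keeps the question's own formatting, including the division of the standard deviation by 2.

```python
def format_cv_results(cv_results):
    """Return one 'mean+/-std params' line per hyperparameter combination.

    cv_results is expected to look like GridSearchCV.cv_results_: a mapping
    with 'mean_test_score', 'std_test_score' and 'params' entries of equal
    length. (Hypothetical helper, not part of scikit-learn.)
    """
    lines = []
    for mean, std, params in zip(cv_results["mean_test_score"],
                                 cv_results["std_test_score"],
                                 cv_results["params"]):
        # Same format string as the question's loop, including std / 2.
        lines.append("%0.3f+/-%0.2f %r" % (mean, std / 2, params))
    return lines


# Placeholder values only, chosen to show the output shape.
demo_results = {
    "mean_test_score": [0.5, 0.75],
    "std_test_score": [0.1, 0.2],
    "params": [
        {"decisiontreeclassifier__max_depth": 1},
        {"decisiontreeclassifier__max_depth": 2},
    ],
}

for line in format_cv_results(demo_results):
    print(line)
```

With a fitted search object, the call would be format_cv_results(grid.cv_results_) instead of the placeholder dict.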
answered 2019-11-15T13:03:01.543