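I want to use GridSearchCV to find the best F1 score for label 1 (pos_label=1), but it seems to be optimizing some other metric, and I don't understand why. Here is my code: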

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, f1_score
from sklearn.model_selection import GridSearchCV

# Score each candidate by the F1 of label 1
f1_scorer = make_scorer(f1_score, pos_label=1)
# The only parameter being searched is the random seed
param_random = {'random_state': range(0,10)}
clf = GridSearchCV(RandomForestClassifier(n_estimators=1, max_features=1), param_random, scoring=f1_scorer)

Output:

Best parameters:  {'random_state': 8}
predict time: 0.0 s
accuracy: 0.840909090909
             precision    recall  f1-score   support

        0.0       0.88      0.95      0.91        38
        1.0       0.33      0.17      0.22         6

avg / total       0.80      0.84      0.82        44

Second attempt, changing only the 'random_state' range:

f1_scorer = make_scorer(f1_score, pos_label=1)
param_random = {'random_state': range(0,100)}
clf = GridSearchCV(RandomForestClassifier(n_estimators=1, max_features=1), param_random, scoring=f1_scorer)

Output:

Best parameters:  {'random_state': 91}
predict time: 0.0 s
accuracy: 0.886363636364
             precision    recall  f1-score   support

        0.0       0.88      1.00      0.94        38
        1.0       1.00      0.17      0.29         6

avg / total       0.90      0.89      0.85        44

Third attempt:

f1_scorer = make_scorer(f1_score, pos_label=1)
param_random = {'random_state': range(0,1000)}
clf = GridSearchCV(RandomForestClassifier(n_estimators=1, max_features=1), param_random, scoring=f1_scorer)

Output:

Best parameters:  {'random_state': 273}
predict time: 0.001 s
accuracy: 0.840909090909
             precision    recall  f1-score   support

        0.0       0.90      0.92      0.91        38
        1.0       0.40      0.33      0.36         6

avg / total       0.83      0.84      0.83        44

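At first I thought it was optimizing for label 0, but it isn't. I don't understand what I'm doing wrong. The results look reasonable, but I know there is at least one better score within this range.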
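One way to check which number the search is actually ranking on, as a minimal sketch: GridSearchCV selects the parameter with the highest mean cross-validated score on the data passed to fit, and that need not be the parameter with the best F1 on a separate test set. The synthetic data from make_classification, the train/test split, and cv=3 below are assumptions for illustration; they are not the data or settings from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: an imbalanced synthetic dataset standing in for the real one.
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.85, 0.15], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

f1_scorer = make_scorer(f1_score, pos_label=1)
param_random = {'random_state': list(range(0, 10))}
clf = GridSearchCV(RandomForestClassifier(n_estimators=1, max_features=1),
                   param_random, scoring=f1_scorer, cv=3)
clf.fit(X_train, y_train)

# Mean cross-validated F1 per seed: this is what the search ranks on.
cv_f1 = clf.cv_results_['mean_test_score']

# F1 on the held-out test set per seed: this is what the report shows.
test_f1 = []
for seed in param_random['random_state']:
    model = RandomForestClassifier(n_estimators=1, max_features=1,
                                   random_state=seed).fit(X_train, y_train)
    test_f1.append(f1_score(y_test, model.predict(X_test), pos_label=1))

print("seed  cv_f1  test_f1")
for seed, cv, te in zip(param_random['random_state'], cv_f1, test_f1):
    print(f"{seed:4d}  {cv:.3f}  {te:.3f}")
print("chosen by search:", clf.best_params_, "mean CV F1:", clf.best_score_)
```

If the two columns disagree about the best seed, the search is still doing what it was asked to do: it maximizes the cross-validated F1, not the test-set F1.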

I know something is off because I can find a better one by hand:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, f1_score
from sklearn.model_selection import GridSearchCV

f1_scorer = make_scorer(f1_score, pos_label=1)
# Restrict the search to the single seed found by hand
param_random = {'random_state': range(2,3)}
clf = GridSearchCV(RandomForestClassifier(n_estimators=1, max_features=1), param_random, scoring=f1_scorer)

Best parameters:  {'random_state': 2}
predict time: 0.0 s
accuracy: 0.886363636364
             precision    recall  f1-score   support

        0.0       0.90      0.97      0.94        38
        1.0       0.67      0.33      0.44         6

avg / total       0.87      0.89      0.87        44
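The label-1 F1 values in the four reports can be cross-checked by hand with F1 = 2·TP / (2·TP + FP + FN). The sketch below is only an arithmetic check; the (TP, FP, FN) counts are an assumption, inferred from the rounded precision and recall values and the support of 6, and the helper name f1_from_counts is hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# (TP, FP, FN) for label 1, keyed by random_state.
# Assumption: counts inferred from the rounded reports (support = 6).
runs = {
    8:   (1, 2, 5),   # precision 0.33, recall 0.17
    91:  (1, 0, 5),   # precision 1.00, recall 0.17
    273: (2, 3, 4),   # precision 0.40, recall 0.33
    2:   (2, 1, 4),   # precision 0.67, recall 0.33
}

scores = {seed: round(f1_from_counts(*counts), 2)
          for seed, counts in runs.items()}
best_seed = max(scores, key=scores.get)

for seed, score in scores.items():
    print(f"random_state={seed}: label-1 F1 = {score}")
print("best on these reports:", best_seed)
```

Under these inferred counts, the hand-picked seed 2 does have the highest label-1 F1 among the four reports, which matches the numbers shown above.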