
I have applied GridSearchCV to the estimators DecisionTreeClassifier, RandomForestClassifier, LogisticRegression, and XGBClassifier, and I use all of them in ensemble learning.

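With the same training and test data on my system and on my friend's system, GridSearchCV gives different results for all of these estimators, and I don't know why.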

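We use the same data for training and testing, yet the grid search returns different results on the two systems. What should I change so that it gives the same result on any system?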

gs_dt = GridSearchCV(estimator=DecisionTreeClassifier(random_state=42,class_weight={1:10, 0:1}),
                  param_grid=[{'max_depth': [ 2,  4,  6, 8, 10], 
                               'criterion':['gini','entropy'], 
                               "max_features":["auto", None], 
                               "max_leaf_nodes":[10,20,30,40]}],
                  scoring=scoring,
                  cv=10,
                  refit='recall')

gs_rf = GridSearchCV(estimator=RandomForestClassifier(n_jobs=-1, oob_score = True,class_weight={1: 10/11, 0: 1/11}),
                    param_grid=[{'max_depth': [4, 6, 8, 10, 12, 16, 20, None],
                                 'max_features': ['auto', 'sqrt'],
                                 'min_samples_leaf': [2, 4, 8], 
                                 'min_samples_split': [10, 20]}],
                    scoring=scoring,
                    cv=10,
                    n_jobs=4,
                    refit='recall')

gs_lr = GridSearchCV(estimator=LogisticRegression(multi_class='ovr',random_state=42,class_weight={1:10, 0:1}),
              param_grid=[{'C': [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1 ,1],
                          'penalty':['l1','l2']}],
              scoring=scoring,
              cv=10,
              refit='recall')

gs_gb = GridSearchCV(estimator=XGBClassifier(n_jobs=-1),
                    param_grid=[{'learning_rate': [0.01, 0.05, 0.1, 0.2],
                                 'max_depth': [4, 6, 8, 10, 12, 16, 20],
                                 'min_samples_leaf': [4, 8, 12, 16, 20], 
                                 'max_features': ['auto', 'sqrt']}],
                    scoring=scoring,
                    cv=10,
                    n_jobs=4,
                    refit='recall')

For example, the first GridSearchCV (gs_dt) gives this result on my system:

DecisionTreeClassifier(class_weight={1: 10, 0: 1}, criterion='gini',
            max_depth=8, max_features=None, max_leaf_nodes=10,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False, random_state=42,
            splitter='best')

On my friend's system, it gives:

DecisionTreeClassifier(class_weight={0: 1, 1: 10}, criterion='gini',
                       max_depth=10, max_features=None, max_leaf_nodes=10,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=42, splitter='best')

The other estimators likewise give different results on my system and my friend's.
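The two printed estimators above are formatted differently (key order in class_weight, indentation), which may hint that the two machines run different scikit-learn versions. Below is a minimal sketch of what could be checked: print the library version, pin every random_state, and pass an explicit splitter instead of cv=10. The synthetic data, the reduced parameter grid, the single "recall" scorer, and the helper name run_search are assumptions for illustration only, not the setup from the question.

```python
import sklearn
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Compare this value on both machines before anything else.
print("scikit-learn:", sklearn.__version__)

# Synthetic stand-in data (assumption: the real train/test data is not shown).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)


def run_search(seed=42):
    """Hypothetical helper: one grid search with every seed fixed."""
    # Explicit splitter, so the folds do not depend on library defaults.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(
        estimator=DecisionTreeClassifier(
            random_state=seed, class_weight={1: 10, 0: 1}
        ),
        param_grid={"max_depth": [2, 4, 6], "max_leaf_nodes": [10, 20]},
        scoring="recall",
        cv=cv,
    )
    search.fit(X, y)
    return search.best_params_, search.best_score_


# Two runs with identical seeds on the same machine should agree.
first = run_search()
second = run_search()
print(first)
print("repeatable:", first == second)
```

If the same script is repeatable on each machine but the machines disagree with each other, the difference more likely comes from the environment (library versions, numerical backends) than from unseeded randomness. In the question's code, RandomForestClassifier and XGBClassifier are constructed without random_state, so those searches would not be expected to repeat even on a single machine.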
