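TPOT reports "Average CV score on the training set was: -128.90187963562252" (neg_MAE) for the exported pipeline. However, refitting the pipeline on the same training set and scoring it on that set yields a much smaller MAE, around 35. Predicting on the unseen test set gives an MAE of around 140, which is consistent with the score stated for the exported pipeline.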

I am a bit confused and would like to know how to reproduce the reported error score on the training set.

Is the pipeline overfitting?

from sklearn.model_selection import RepeatedKFold
from tpot import TPOTRegressor

# 4-fold CV, used by TPOT to score candidate pipelines
cv = RepeatedKFold(n_splits=4, n_repeats=1, random_state=1)
model = TPOTRegressor(generations=10, population_size=25, offspring_size=None,
                      mutation_rate=0.9, crossover_rate=0.1,
                      scoring='neg_mean_absolute_error', cv=cv,
                      subsample=0.75, n_jobs=-1, max_time_mins=None,
                      max_eval_time_mins=5, random_state=42, config_dict=None,
                      template=None, warm_start=False, memory=None,
                      use_dask=False, periodic_checkpoint_folder=None,
                      early_stop=3, verbosity=2,
                      disable_update_check=False, log_file=None)

model.fit(train_df[x], train_df[y])

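# The exported model
# Average CV score on the training set was: -128.90187963562252

from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from tpot.builtins import StackingEstimator
from tpot.export_utils import set_param_recursive

exported_pipeline = make_pipeline(
    StackingEstimator(estimator=LassoLarsCV(normalize=True)),
    StackingEstimator(estimator=ExtraTreesRegressor(
        bootstrap=True, max_features=0.4, min_samples_leaf=1,
        min_samples_split=7, n_estimators=100)),
    PolynomialFeatures(degree=2, include_bias=False, interaction_only=False),
    ExtraTreesRegressor(
        bootstrap=True, max_features=0.15000000000000002, min_samples_leaf=9,
        min_samples_split=7, n_estimators=100))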

# Fix random state for all the steps in exported pipeline

set_param_recursive(exported_pipeline.steps, 'random_state', 42)
exported_pipeline.fit(training_features, training_target) 
results = exported_pipeline.predict(testing_features)
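
To make the comparison concrete, here is a minimal sketch of the two quantities I am comparing: the out-of-fold MAE from cross-validation and the in-sample MAE from fitting and predicting on the same data. It uses synthetic data and a single ExtraTreesRegressor as stand-ins (both are assumptions, not my real data or the full exported pipeline). Note also that TPOT was run with subsample=0.75, so its reported score may not be exactly reproducible from the full training set.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic stand-in for train_df[x] / train_df[y] (assumption, not the real data)
X, y = make_regression(n_samples=200, n_features=10, noise=20.0, random_state=0)

# Single estimator standing in for the exported pipeline (assumption)
estimator = ExtraTreesRegressor(bootstrap=True, min_samples_split=7,
                                n_estimators=50, random_state=42)

# Same CV splitter as passed to TPOTRegressor
cv = RepeatedKFold(n_splits=4, n_repeats=1, random_state=1)

# Out-of-fold MAE: each fold is scored on rows the model did not see while fitting
cv_scores = cross_val_score(estimator, X, y,
                            scoring="neg_mean_absolute_error", cv=cv)
cv_mae = -cv_scores.mean()

# In-sample MAE: fit and predict on the same rows
estimator.fit(X, y)
train_mae = mean_absolute_error(y, estimator.predict(X))

print(f"cross-validated MAE: {cv_mae:.2f}")
print(f"in-sample MAE:       {train_mae:.2f}")
```

If the same gap shows up when exported_pipeline is passed to cross_val_score with my real training data, then the 35 would just be the in-sample error, and the CV figure would be the one comparable to the test MAE.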

Thanks in advance.
