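TPOT classifiers and regressors give you a scikit-learn pipeline object that already does this for you.
If you look at the TPOT API, both TPOTClassifier and TPOTRegressor expose an attribute, fitted_pipeline_, which contains the best scikit-learn pipeline TPOT was able to find. An example of such a scikit-learn pipeline: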
PolynomialFeatures(degree=2, include_bias=False, interaction_only=False),
XGBRegressor(learning_rate=0.1, max_depth=4, min_child_weight=14, n_estimators=100, n_jobs=1, objective="reg:squarederror", subsample=1.0, verbosity=0)
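Because fitted_pipeline_ is an ordinary scikit-learn Pipeline, you can inspect and use it like any other. The following is a minimal sketch under stated assumptions: a hand-built pipeline stands in for tpot.fitted_pipeline_ so that it runs without a TPOT search, the data is synthetic, and GradientBoostingRegressor replaces XGBRegressor purely to avoid the xgboost dependency.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumption); stands in for whatever TPOT was fitted on.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# Hand-built stand-in for `tpot.fitted_pipeline_` (assumption).
# GradientBoostingRegressor is swapped in for XGBRegressor on purpose.
best_pipeline = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False, interaction_only=False),
    GradientBoostingRegressor(
        n_estimators=100, max_depth=4, learning_rate=0.1, random_state=0
    ),
)
best_pipeline.fit(X, y)

# make_pipeline names each step after its lowercased class name.
step_names = [name for name, _ in best_pipeline.steps]
print(step_names)

# The last step is the final estimator; earlier steps are transformers.
final_estimator = best_pipeline.steps[-1][1]
print(type(final_estimator).__name__)

# The whole pipeline predicts like a single estimator.
predictions = best_pipeline.predict(X[:3])
print(predictions.shape)
```

With a real TPOT run you would replace the hand-built pipeline with the fitted_pipeline_ attribute of your fitted TPOTClassifier or TPOTRegressor.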
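You can dump it and load it later, so that you do not have to retrain your model. Alternatively, you can use the export function built into the TPOT classifiers and regressors, which writes the optimized pipeline out as Python code so that you can re-fit your model: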
tpot.export('tpot_digits_pipeline.py')
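For the first option, dumping the fitted pipeline, one possible approach is the standard-library pickle module. This is only a sketch: the pipeline below is a small stand-in for tpot.fitted_pipeline_, the data is synthetic, and the file name best_pipeline.pkl is hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumption).
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=0)

# Stand-in for `tpot.fitted_pipeline_` (assumption).
fitted_pipeline = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    Ridge(alpha=1.0),
)
fitted_pipeline.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; a temporary directory keeps the sketch self-cleaning.
    path = os.path.join(tmp_dir, "best_pipeline.pkl")

    # Dump the fitted pipeline to disk.
    with open(path, "wb") as fh:
        pickle.dump(fitted_pipeline, fh)

    # Load it back later without retraining.
    with open(path, "rb") as fh:
        restored_pipeline = pickle.load(fh)

# The restored pipeline should reproduce the original predictions.
same_predictions = bool(
    np.allclose(fitted_pipeline.predict(X), restored_pipeline.predict(X))
)
print(same_predictions)
```

Pickled pipelines are best loaded with the same library versions that produced them, and only from sources you trust.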
If for some reason all you have is the output you posted in your question, you can recreate the scikit-learn pipeline like this:
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from tpot.builtins import StackingEstimator
from xgboost import XGBRegressor

# NOTE: fill in the data path, the column separator, and the name of the target column.
tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR', dtype=np.float64)
features = tpot_data.drop('target', axis=1)
training_features, testing_features, training_target, testing_target = \
    train_test_split(features, tpot_data['target'], random_state=42)

# TPOT prints nested estimators as A(B(C(...))). In exported code each inner
# estimator becomes a StackingEstimator step that feeds the next one, innermost first.
exported_pipeline = make_pipeline(
    StackingEstimator(estimator=XGBRegressor(<replace with actual arg list>)),
    StackingEstimator(estimator=XGBRegressor(<replace with actual arg list>)),
    RandomForestRegressor(<replace with actual arg list>)
)

exported_pipeline.fit(training_features, training_target)
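Once the recreated pipeline is fitted, the held-out split can be used to check it. Below is a rough sketch of that last step, assuming synthetic data in place of your CSV file and a single RandomForestRegressor with made-up hyperparameters in place of the real argument lists.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic data (assumption); replace with your own features and target.
features, target = make_regression(
    n_samples=300, n_features=6, noise=0.5, random_state=42
)
training_features, testing_features, training_target, testing_target = \
    train_test_split(features, target, random_state=42)

# Hypothetical hyperparameters; use the argument list TPOT reported instead.
exported_pipeline = make_pipeline(
    RandomForestRegressor(n_estimators=50, random_state=42),
)
exported_pipeline.fit(training_features, training_target)

# Predict on the rows the pipeline has not seen and score them.
results = exported_pipeline.predict(testing_features)
test_r2 = r2_score(testing_target, results)
print(f"held-out R^2: {test_r2:.3f}")
```

The default test_size of train_test_split holds out 25% of the rows, so 75 of the 300 synthetic samples are scored here.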