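GridSearchCV (whether from sklearn or from dask) seems to handle its parameter grid in an odd or wrong way, so that MLPRegressor ignores the parameter.
Here is a minimal working example that shows the behaviour.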
Assume features and values are initialized with numbers; in my case
print(features.shape)
print(values.shape)
(321278, 36)
(321278,)
and run the following code
from dask_ml.model_selection import GridSearchCV as daskGridSearchCV
from sklearn.model_selection import GridSearchCV as skGridSearchCV
from sklearn.neural_network import MLPRegressor
myparams = {'hidden_layer_sizes': [(2, ), (4, )]}
daskgridCV = daskGridSearchCV(estimator=MLPRegressor(), n_jobs=-1, param_grid=myparams)
daskbestfit = daskgridCV.fit(features, values)
skgridCV = skGridSearchCV(estimator=MLPRegressor(), n_jobs=-1, param_grid=myparams, cv=3)
skbestfit = skgridCV.fit(features, values)
display(daskbestfit)
display(skbestfit)
The result is
GridSearchCV(cache_cv=True, cv=None, error_score='raise',
estimator=MLPRegressor(activation='relu', alpha=0.0001,
batch_size='auto', beta_1=0.9, beta_2=0.999,
early_stopping=False, epsilon=1e-08,
hidden_layer_sizes=(100,),
learning_rate='constant',
learning_rate_init=0.001, max_iter=200,
momentum=0.9, n_iter_no_change=10,
nesterovs_momentum=True, power_t=0.5,
random_state=None, shuffle=True,
solver='adam', tol=0.0001,
validation_fraction=0.1, verbose=False,
warm_start=False),
iid=True, n_jobs=-1,
param_grid={'hidden_layer_sizes': [(2,), (4,)]}, refit=True,
return_train_score=False, scheduler=None, scoring=None)
GridSearchCV(cv=3, error_score='raise-deprecating',
estimator=MLPRegressor(activation='relu', alpha=0.0001,
batch_size='auto', beta_1=0.9, beta_2=0.999,
early_stopping=False, epsilon=1e-08,
hidden_layer_sizes=(100,),
learning_rate='constant',
learning_rate_init=0.001, max_iter=200,
momentum=0.9, n_iter_no_change=10,
nesterovs_momentum=True, power_t=0.5,
random_state=None, shuffle=True,
solver='adam', tol=0.0001,
validation_fraction=0.1, verbose=False,
warm_start=False),
iid='warn', n_jobs=-1,
param_grid={'hidden_layer_sizes': [(2,), (4,)]},
pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
scoring=None, verbose=0)
So in both cases the hidden_layer_sizes parameter has the value (100,), which is not in the grid. What am I doing wrong, or what is happening here?
Python version 3.6.9
sklearn version 0.21.2
dask_ml version 1.0.0
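
One possible way to narrow this down, sketched under assumptions: in scikit-learn, fit returns the search object itself, and the MLPRegressor shown in its printed representation is the template passed in as estimator. That template is cloned for each candidate, so it keeps its default hidden_layer_sizes. The values actually tried should be visible on the fitted attributes instead. The small synthetic features and values arrays, max_iter=50 and random_state=0 below are illustrative choices that are not part of the original setup, and only the sklearn variant is shown.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

# Small synthetic stand-in for the real (321278, 36) data; purely illustrative.
rng = np.random.RandomState(0)
features = rng.rand(200, 5)
values = features.sum(axis=1)

myparams = {'hidden_layer_sizes': [(2,), (4,)]}

# max_iter and random_state are assumptions made to keep the sketch fast and repeatable.
gridCV = GridSearchCV(estimator=MLPRegressor(max_iter=50, random_state=0),
                      param_grid=myparams, cv=3)
gridCV.fit(features, values)

# The template estimator is never modified by the search, so it keeps the default.
print(gridCV.estimator.hidden_layer_sizes)

# The winning grid entry and the model refitted with it (refit=True is the default).
print(gridCV.best_params_)
print(gridCV.best_estimator_.hidden_layer_sizes)

# Every parameter combination that was evaluated.
print(gridCV.cv_results_['params'])
```

If best_params_ and cv_results_['params'] list (2,) and (4,), the grid was applied and only the printed template is misleading.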