python - Why does the pipeline throw a FitFailedWarning when I try LabelEncoder in a pipeline?

Question

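I'm new to machine learning and am building a project to keep myself busy, so I don't yet know much about how sklearn works. The main goal is to train a model to predict a categorical variable. When I try to label-encode the y variable for the model, I get the following error: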

ValueError: not enough values to unpack (expected 3, got 2)

  FitFailedWarning)

This is the code I'm using:

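from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder

# Rough training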

cols_to_use = [col for col in formatData.columns if col not in 'type1']
x = formatData[cols_to_use]
y = formatData.type1
#print(x.columns)
#print(y)


numerical_transformer = SimpleImputer(strategy='constant')
categorical_tansformer = Pipeline(steps=[
                                        ('imputer', SimpleImputer(strategy='most_frequent')),
                                        ('label', LabelEncoder())
                                        ])


preprocessor = ColumnTransformer(transformers=[('num',numerical_transformer),('cat',categorical_tansformer)])

my_pipeline = Pipeline(steps=[('preprocessor',preprocessor),
                              ('model',RandomForestRegressor(n_estimators=50,random_state=0))])

from sklearn.model_selection import cross_validate
from sklearn.model_selection import cross_val_predict

cv_results = cross_validate(my_pipeline,x,y,cv=5,scoring=('r2','neg_mean_absolute_error'))

predictions = cross_val_predict(my_pipeline,x,y,cv=5)
print(cv_results['test_neg_mean_absolute_error'])
print(predictions)

Any help is appreciated; please comment if you need more information.

score 0 · Accepted Answer

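Pipelines are meant to transform X, not y. (There is some discussion around this, especially for resamplers, which should change the rows of X and y together; see imblearn for a fix in at least that direction.)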

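In particular, fit_transform(X, y) is by default defined as fit(X, y).transform(X). So a LabelEncoder in the pipeline will attempt to transform X, and it will fail because it doesn't know what to do with 2D input. You should just label-encode y outside of the pipeline.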
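As a rough illustration of that advice, here is a minimal sketch that encodes y before it ever reaches the pipeline. Several things in it are assumptions rather than part of the question: the `format_data` frame and its `hp` and `color` columns are made-up stand-ins for `formatData`, `OneHotEncoder` is swapped in to encode the categorical feature columns (LabelEncoder is intended for targets), and `RandomForestClassifier` replaces the regressor because the target is categorical. The sketch also passes `(name, transformer, columns)` triples to ColumnTransformer, which is the form it expects and which the question's code omits.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical stand-in for formatData; the real columns are not shown in the question.
format_data = pd.DataFrame({
    "hp": [45, 60, np.nan, 39, 58, 78, 44, 59, 79, 40, 65, 85],
    "color": ["green", "green", "green", "red", np.nan, "red",
              "blue", "blue", "blue", "green", "red", "blue"],
    "type1": ["grass", "grass", "grass", "fire", "fire", "fire",
              "water", "water", "water", "grass", "fire", "water"],
})

x = format_data.drop(columns="type1")
y = format_data["type1"]

# Encode the target OUTSIDE the pipeline: LabelEncoder works on 1D y only.
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

numerical_cols = ["hp"]        # assumed numeric feature columns
categorical_cols = ["color"]   # assumed categorical feature columns

numerical_transformer = SimpleImputer(strategy="constant")
categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="most_frequent")),
    # Feature columns are 2D, so use an encoder designed for X here.
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

# Each entry is a (name, transformer, columns) triple.
preprocessor = ColumnTransformer(transformers=[
    ("num", numerical_transformer, numerical_cols),
    ("cat", categorical_transformer, categorical_cols),
])

my_pipeline = Pipeline(steps=[
    ("preprocessor", preprocessor),
    ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# The pipeline only ever sees the already-encoded target.
predictions = cross_val_predict(my_pipeline, x, y_encoded, cv=2)

# Map the integer predictions back to the original class names.
predicted_labels = label_encoder.inverse_transform(predictions)
print(predicted_labels)
```

Keeping the fitted `label_encoder` around is what makes the last step possible: `inverse_transform` turns the model's integer outputs back into the original `type1` names.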
