0

I have a dataset consisting of features and their labels.

It looks like this:

X1, X2, X3, X4, X5 .. Xn L1, L2, L3
Y1, Y2, Y3, Y4, Y5 .. Yn L5, L2
..

I want to train a KNeighborsClassifier on this dataset. It seems that sklearn does not accept multiple labels per sample. This is what I have been trying:

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(Y)

# parameters:  n_neighbors=[5,15], weights = 'uniform', 'distance'
bagging = BaggingClassifier(KNeighborsClassifier(n_neighbors =5,weights ='uniform'), max_samples = 0.6, max_features= 0.7, verbose =1, oob_score =True)
scores = cross_val_score(bagging, X, Y, verbose =1, cv=3, n_jobs=3, scoring='f1_macro')

It gives me ValueError: bad input shape

Is there a way to run a multi-label classifier in sklearn?


3 Answers

2

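According to the sklearn documentation, the classifiers that support multioutput-multiclass classification tasks are:

Decision Trees, Random Forests, Nearest Neighbors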
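
As a minimal sketch of what that means for the question, the snippet below fits a plain KNeighborsClassifier directly on the binary indicator matrix produced by MultiLabelBinarizer, with no wrapper. The toy features and label sets are invented for illustration and are not the asker's data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: 6 samples, 2 features, each sample has a set of labels.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels = [["L1", "L2"], ["L1", "L2"], ["L1"],
          ["L3"], ["L3", "L5"], ["L3", "L5"]]

# Turn the label sets into a binary indicator matrix, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # shape (6, 4), columns ordered as mlb.classes_

# KNeighborsClassifier accepts a 2D target and votes on each column separately.
knn = KNeighborsClassifier(n_neighbors=3, weights="uniform")
knn.fit(X, Y)

# Predict indicator rows for two new points, then map them back to label names.
pred = knn.predict(np.array([[0.05, 0.1], [5.05, 5.1]]))
predicted_labels = mlb.inverse_transform(pred)
print(predicted_labels)
```

Each column is decided by a majority vote among the neighbours, so a label is predicted only when most of the nearest samples carry it.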

answered 2015-10-28T12:37:16.277
2

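Since you have a binary matrix for your labels, you can use OneVsRestClassifier to make your BaggingClassifier handle multi-label predictions. Your code should now look like this: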

bagging = BaggingClassifier(KNeighborsClassifier(n_neighbors=5, weights='uniform'), max_samples=0.6, max_features=0.7, verbose=1, oob_score=True)
clf = OneVsRestClassifier(bagging)
scores = cross_val_score(clf, X, Y, verbose=1, cv=3, n_jobs=3, scoring='f1_macro')

You can use OneVsRestClassifier with any sklearn model to do multi-label classification.

Here is an explanation:
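
For anyone who wants to run this end to end, here is a self-contained sketch with the imports filled in. It assumes a recent sklearn where cross_val_score lives in sklearn.model_selection, and it substitutes synthetic data from make_multilabel_classification for the asker's X and Y. The oob_score and verbose options are left out for brevity, and random_state is added only to make the run repeatable.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data: Y is already a binary indicator matrix.
X, Y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=4, random_state=0
)

# BaggingClassifier only handles a 1D target, so wrap it in OneVsRestClassifier,
# which trains one bagged KNN ensemble per label column.
bagging = BaggingClassifier(
    KNeighborsClassifier(n_neighbors=5, weights="uniform"),
    max_samples=0.6,
    max_features=0.7,
    random_state=0,
)
clf = OneVsRestClassifier(bagging)

# Macro-averaged F1 over the label columns, 3-fold cross-validation.
scores = cross_val_score(clf, X, Y, cv=3, scoring="f1_macro")
print(scores)

# After fitting, predictions come back as an indicator matrix of the same width as Y.
clf.fit(X, Y)
sample_pred = clf.predict(X[:5])
```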

http://scikit-learn.org/stable/modules/multiclass.html#one-vs-the-rest

And here is the documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html

answered 2015-12-09T15:37:35.263
1

For anybody who finds this while looking for multi-label KNN (MLkNN) options, I would recommend skmultilearn. It is built on top of sklearn, so it is easy to use if you are familiar with the latter package.

Documentation here. This example is from the documentation:

from skmultilearn.adapt import MLkNN

classifier = MLkNN(k=3)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)
answered 2021-12-01T11:42:15.637