
Is it possible (and if so, how) to train an sklearn MultinomialNB classifier dynamically? I want to train (update) my spam classifier every time I feed a new email into it.

I want this (which does not work):

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
for i in range(len(x_train)):
    clf.fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)

to give results similar to this (which works fine):

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
clf.fit(x_train, y_train)
preds = clf.predict(x_test)

1 Answer


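Scikit-learn supports incremental learning for several algorithms, including MultinomialNB. See the scikit-learn documentation on incremental learning for details.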

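You need to use the partial_fit() method instead of fit(), because each call to fit() discards whatever the model learned before. Your example code would then look like this: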

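import numpy
from sklearn.model_selection import train_test_split as tts  # assumed meaning of tts
from sklearn.naive_bayes import MultinomialNB

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
# Every label that can ever appear must be passed on the first partial_fit call.
classes = numpy.unique(y_train)
for i in range(len(x_train)):
    if i == 0:
        clf.partial_fit([x_train[i]], [y_train[i]], classes=classes)
    else:
        clf.partial_fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)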

Edit: added the classes argument to partial_fit, as suggested by @BobWazowski.
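For the spam use case, here is a minimal self-contained sketch of the same idea applied to raw email text. The toy emails, the labels, the helper name train_incrementally, and the choice of HashingVectorizer with n_features=2**10 are all assumptions made for illustration and are not part of the original question. A stateless vectorizer is used because incremental training needs the number of feature columns to stay fixed as new emails arrive.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data: 1 = spam, 0 = not spam.
emails = [
    "win a free prize now",
    "meeting moved to monday",
    "free money click now",
    "lunch at noon tomorrow",
    "claim your free prize",
    "project report attached",
]
labels = np.array([1, 0, 1, 0, 1, 0])
classes = np.unique(labels)

# Stateless vectorizer: the feature space is fixed up front, so a new email
# never changes the number of columns. alternate_sign=False keeps features
# non-negative (required by MultinomialNB); norm=None keeps raw counts.
vectorizer = HashingVectorizer(n_features=2**10, alternate_sign=False, norm=None)


def train_incrementally(texts, targets, classes):
    """Update the classifier with one email at a time (hypothetical helper)."""
    clf = MultinomialNB()
    for text, target in zip(texts, targets):
        X = vectorizer.transform([text])
        # classes is only required on the first call; passing the same
        # value again on later calls is allowed.
        clf.partial_fit(X, [target], classes=classes)
    return clf


incremental = train_incrementally(emails, labels, classes)

# Reference model trained on the whole batch in a single fit() call.
batch = MultinomialNB().fit(vectorizer.transform(emails), labels)

new_email = vectorizer.transform(["free prize now"])
print(incremental.predict(new_email), batch.predict(new_email))
```

Because MultinomialNB only accumulates per-class feature counts, feeding the emails one at a time should leave the model with the same learned parameters as a single fit() over the whole batch, which is why the two predictions are expected to agree.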

Answered 2020-05-26T14:13:12.597