In FastText, I want to change the balance between precision and recall. Can this be done?
Viewed 482 times
1 Answer
1
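If you mean the Python fasttext implementation, I'm afraid there is no simple built-in way to do this. What you can do is take the returned probabilities and pass them to the AUC or ROC curve method of your choice. Here is a code example that does this for a binary classifier: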
import re

# label the data (fastText expects one line per text, so newlines become spaces)
labels, probabilities = fasttext_classifier.predict(
    [re.sub('\n', ' ', sentence) for sentence in test_sentences])
# convert fasttext top-1 results to a binary classifier (probability of TRUE)
labels = [label[0] == '__label__TRUE' for label in labels]
probabilities = [probability[0] if label else (1 - probability[0])
                 for label, probability in zip(labels, probabilities)]
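To make the conversion above concrete, here is a minimal sketch on made-up predictions. The values in `raw_labels` and `raw_probs` are invented and only mimic the shape that `predict` returns for a list of texts with the default `k=1`. The helper `probability_of_true` is a hypothetical name, not part of fastText.

```python
# Hypothetical predictions in the shape returned for a list of texts with
# k=1: one label list and one probability list per text. Values are made up.
raw_labels = [['__label__TRUE'], ['__label__FALSE'], ['__label__TRUE']]
raw_probs = [[0.9], [0.8], [0.55]]


def probability_of_true(label_list, prob_list, positive='__label__TRUE'):
    """Return P(TRUE) from the top-1 prediction of a two-class model."""
    top_label, top_prob = label_list[0], float(prob_list[0])
    # When the top label is the negative class, P(TRUE) is its complement.
    return top_prob if top_label == positive else 1.0 - top_prob


p_true = [probability_of_true(labels_i, probs_i)
          for labels_i, probs_i in zip(raw_labels, raw_probs)]
print([round(p, 2) for p in p_true])  # [0.9, 0.2, 0.55]
```

The complement trick only holds for a two-class model whose probabilities sum to one. With more labels you would need the probability of the positive label itself.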
Then you are free to build your metrics with the usual sklearn methods:
from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import f1_score
from sklearn.metrics import auc
from matplotlib import pyplot
# testy holds the ground-truth binary labels for test_sentences
roc_auc = roc_auc_score(testy, probabilities)
print('ROC AUC=%.3f' % roc_auc)
# calculate roc curve
fpr, tpr, _ = roc_curve(testy, probabilities)
# plot the roc curve for the model
pyplot.plot(fpr, tpr, marker='.', label='ROC curve')
# axis labels
pyplot.xlabel('False Positive Rate (1 - specificity)')
pyplot.ylabel('True Positive Rate (sensitivity)')
# show the legend
pyplot.legend()
# show the plot
pyplot.show()
precision_values, recall_values, _ = precision_recall_curve(testy, probabilities)
f1 = f1_score(testy, labels)
# area under the precision-recall curve
pr_auc = auc(recall_values, precision_values)
# summarize scores
print('f1=%.3f PR AUC=%.3f' % (f1, pr_auc))
# plot the precision-recall curves
pyplot.plot(recall_values, precision_values, marker='.', label='Precision,Recall')
# axis labels
pyplot.xlabel('Recall')
pyplot.ylabel('Precision')
# show the legend
pyplot.legend()
# show the plot
pyplot.show()
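The command-line version of fasttext has a threshold parameter, so you can do multiple runs with different thresholds, but this is very time-consuming.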
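Once the probability of TRUE is in memory, different thresholds can be tried without rerunning the model. The sketch below uses plain Python and made-up data. `precision_recall_at` and `lowest_threshold_for_precision` are hypothetical helper names, and the convention of reporting precision 1.0 when nothing is predicted positive is an assumption. With sklearn, the third value returned by `precision_recall_curve` holds the candidate thresholds and could serve the same purpose.

```python
def precision_recall_at(y_true, p_true, threshold):
    """Precision and recall when predicting TRUE for p_true >= threshold."""
    pairs = list(zip(y_true, p_true))
    tp = sum(1 for y, p in pairs if y and p >= threshold)
    fp = sum(1 for y, p in pairs if not y and p >= threshold)
    fn = sum(1 for y, p in pairs if y and p < threshold)
    # Assumed convention: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def lowest_threshold_for_precision(y_true, p_true, target_precision):
    """Lowest observed score that meets the target precision, or None.

    Recall never increases as the threshold rises, so the lowest
    qualifying threshold also gives the highest recall.
    """
    for threshold in sorted(set(p_true)):
        precision, _ = precision_recall_at(y_true, p_true, threshold)
        if precision >= target_precision:
            return threshold
    return None


# Made-up ground truth and P(TRUE) scores, for illustration only.
y_true = [True, True, False, True, False, False]
p_true = [0.95, 0.80, 0.70, 0.60, 0.40, 0.10]

for t in (0.3, 0.5, 0.75):
    precision, recall = precision_recall_at(y_true, p_true, t)
    print('threshold=%.2f precision=%.2f recall=%.2f' % (t, precision, recall))

best = lowest_threshold_for_precision(y_true, p_true, 0.75)
print('lowest threshold with precision >= 0.75:', best)
```

Raising the threshold trades recall for precision. On this toy data, moving from 0.5 to 0.75 removes the one false positive but also drops one true positive.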
Answered 2021-01-17T05:04:01.760