TL;DR:
>>> import nltk
>>> hypothesis = ['This', 'is', 'cat']
>>> reference = ['This', 'is', 'a', 'cat']
>>> references = [reference] # list of references for 1 sentence.
>>> list_of_references = [references] # list of references for all sentences in corpus.
>>> list_of_hypotheses = [hypothesis] # list of hypotheses that corresponds to list of references.
>>> nltk.translate.bleu_score.corpus_bleu(list_of_references, list_of_hypotheses)
0.6025286104785453
>>> nltk.translate.bleu_score.sentence_bleu(references, hypothesis)
0.6025286104785453
(Note: you have to pull the latest version of NLTK on the develop branch to get a stable version of the BLEU score implementation.)
In long:
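Actually, if there is only one reference and one hypothesis in your whole corpus, both corpus_bleu() and sentence_bleu() should return the same value, as shown in the example above.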
In the code, we see that sentence_bleu is actually a duck-type of corpus_bleu:
def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
                  smoothing_function=None):
    return corpus_bleu([references], [hypothesis], weights, smoothing_function)
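To see why the two calls agree for a single sentence pair, here is a minimal sketch of the same delegation pattern, using a toy metric (pooled, clipped unigram precision) in place of BLEU. The names `corpus_unigram_precision` and `sentence_unigram_precision` are made up for this illustration and are not part of NLTK.

```python
from collections import Counter


def corpus_unigram_precision(list_of_references, hypotheses):
    """Toy corpus-level metric (NOT BLEU): clipped unigram precision,
    with matches and totals pooled over every sentence in the corpus."""
    matched, total = 0, 0
    for references, hypothesis in zip(list_of_references, hypotheses):
        hyp_counts = Counter(hypothesis)
        # Clip each token at the highest count seen in any single reference.
        max_ref_counts = Counter()
        for reference in references:
            for token, count in Counter(reference).items():
                max_ref_counts[token] = max(max_ref_counts[token], count)
        matched += sum(min(count, max_ref_counts[token])
                       for token, count in hyp_counts.items())
        total += len(hypothesis)
    return matched / total if total else 0.0


def sentence_unigram_precision(references, hypothesis):
    # Same shape of delegation as sentence_bleu: wrap each argument in a
    # one-element list and hand it to the corpus-level function.
    return corpus_unigram_precision([references], [hypothesis])


references = [["This", "is", "a", "cat"]]
hypothesis = ["This", "is", "cat"]

sentence_score = sentence_unigram_precision(references, hypothesis)
corpus_score = corpus_unigram_precision([references], [hypothesis])
print(sentence_score == corpus_score)
```

Because the sentence-level function only wraps its arguments and forwards them, a corpus that contains exactly one sentence pair cannot produce a different number.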
And if we look at the parameters for sentence_bleu:
def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
                  smoothing_function=None):
    """
    :param references: reference sentences
    :type references: list(list(str))
    :param hypothesis: a hypothesis sentence
    :type hypothesis: list(str)
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :return: The sentence-level BLEU score.
    :rtype: float
    """
The input for sentence_bleu's references is a list(list(str)).
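So if you have a sentence string, e.g. "This is a cat", you have to tokenize it to get a list of strings, ["This", "is", "a", "cat"]. Since multiple references are allowed, the argument has to be a list of lists of strings. For example, if you have a second reference, "This is a feline", your input to sentence_bleu() would be: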
references = [ ["This", "is", "a", "cat"], ["This", "is", "a", "feline"] ]
hypothesis = ["This", "is", "cat"]
sentence_bleu(references, hypothesis)
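If your data starts out as plain strings, the tokenization step might look like the sketch below. It assumes simple whitespace splitting is good enough; `tokenize` is a hypothetical helper written for this example, and you may want to swap in a proper tokenizer.

```python
def tokenize(sentence):
    """Whitespace tokenizer: a stand-in for whatever tokenizer you really use."""
    return sentence.split()


reference_strings = ["This is a cat", "This is a feline"]
hypothesis_string = "This is cat"

# references must be list(list(str)): one token list per reference.
references = [tokenize(s) for s in reference_strings]
# hypothesis must be list(str): a single token list.
hypothesis = tokenize(hypothesis_string)

print(references)
print(hypothesis)
```

The resulting `references` and `hypothesis` have the shapes that sentence_bleu(references, hypothesis) expects.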
When it comes to the list_of_references parameter of corpus_bleu(), it is basically a list of whatever sentence_bleu() takes as references:
def corpus_bleu(list_of_references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25),
                smoothing_function=None):
    """
    :param references: a corpus of lists of reference sentences, w.r.t. hypotheses
    :type references: list(list(list(str)))
    :param hypotheses: a list of hypothesis sentences
    :type hypotheses: list(list(str))
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :return: The corpus-level BLEU score.
    :rtype: float
    """
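For a corpus with more than one sentence, the nesting can be built as in the following sketch. `build_corpus_inputs` is a hypothetical helper (not an NLTK function), the second sentence pair is invented for the example, and whitespace splitting is assumed as the tokenizer.

```python
def build_corpus_inputs(reference_strings_per_sentence, hypothesis_strings):
    """Turn parallel lists of raw strings into the nested token lists that
    corpus_bleu() takes. Hypothetical helper; whitespace tokenization assumed."""
    if len(reference_strings_per_sentence) != len(hypothesis_strings):
        raise ValueError("need exactly one group of references per hypothesis")
    # list(list(list(str))): corpus -> references of one sentence -> tokens
    list_of_references = [
        [reference.split() for reference in reference_group]
        for reference_group in reference_strings_per_sentence
    ]
    # list(list(str)): corpus -> tokens of one hypothesis
    hypotheses = [hypothesis.split() for hypothesis in hypothesis_strings]
    return list_of_references, hypotheses


list_of_references, hypotheses = build_corpus_inputs(
    [
        ["This is a cat", "This is a feline"],  # references for sentence 1
        ["The dog barks loudly"],               # references for sentence 2 (made up)
    ],
    ["This is cat", "The dog barks"],
)

print(len(list_of_references), len(hypotheses))
```

The i-th entry of `list_of_references` holds all references for the i-th entry of `hypotheses`, so the two outer lists must have the same length.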
Other than looking at the doctests in nltk/translate/bleu_score.py, you can also take a look at the unit tests in nltk/test/unit/translate/test_bleu_score.py to see how to use each component of bleu_score.py.
By the way, since sentence_bleu is imported as bleu in nltk.translate.__init__.py (https://github.com/nltk/nltk/blob/develop/nltk/translate/__init__.py#L21), using
from nltk.translate import bleu
would be the same as:
from nltk.translate.bleu_score import sentence_bleu
And in code:
>>> from nltk.translate import bleu
>>> from nltk.translate.bleu_score import sentence_bleu
>>> from nltk.translate.bleu_score import corpus_bleu
>>> bleu == sentence_bleu
True
>>> bleu == corpus_bleu
False