
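I have a list of reference words (correctly spelled) and a word entered by the user. I want to compare the input word against the reference list using Levenshtein distance and return the reference word with the lowest cost. The reference list is sorted by frequency, with the most frequent words at the top. If two words are at the same distance, the more frequent one should be returned. `NWORDS` is my reference list sorted by frequency, and `candidate` is the word entered by the user.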

Code:

for word in NWORDS: #iterate over all words in ref
    i = jf.levenshtein_distance(candidate,word) #compute distance for each word with user input

    # don't know what to do here
    return word #function returns word from ref list with lowest dist and highest frequency of occurrence.

1 Answer


You can handle it as follows:

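import jellyfish as jf  # assumption: `jf` in the question refers to the jellyfish package

def best_match(candidate, NWORDS):  # wrapper name is illustrative, so that `return` is valid
    match = None  # best matching word so far
    dist = None   # distance of the best match so far
    for word in NWORDS:  # iterate over the reference words, most frequent first
        i = jf.levenshtein_distance(candidate, word)  # distance between the user input and this word
        # Strict < keeps the earlier (more frequent) word on ties;
        # use <= instead if NWORDS lists the least frequent words first.
        if dist is None or i < dist:
            match, dist = word, i
    return match  # closest word; ties resolve to the more frequent one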
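
The same tie-breaking behaviour can also be had from the built-in `min`, which returns the first minimal item it encounters, so with `NWORDS` ordered most frequent first the more frequent word wins a tie. The following is only a sketch: `levenshtein` is a plain-Python stand-in written for this example so that it runs without third-party packages (in practice `jf.levenshtein_distance` would take its place), and `closest_word` is a hypothetical helper name.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_word(candidate, nwords):
    """Return the word in nwords nearest to candidate, or None if nwords is empty.

    min() keeps the first minimal item, so list order decides ties.
    """
    return min(nwords, key=lambda w: levenshtein(candidate, w), default=None)


if __name__ == "__main__":
    # Hypothetical frequency-ordered list, most frequent first.
    NWORDS = ["the", "then", "than"]
    print(closest_word("thn", NWORDS))
```

Compared with the explicit loop, this is shorter but does not keep the best distance around; if the distance itself is needed, the loop is the more convenient form.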
Answered 2014-02-16T10:07:58.993