I'm working in Python, and I've decided to build a large list of phrases to compare against the results of a speech recognition module. So far I have:

phrases = [
    "what time is it",
    "what's the weather",
    "what's the date",
    "what's up",
    "how are you"
]

def match(phrase):
    #match_greatest will start at zero but continuously update if the string
    #being compared has a higher percentage match
    match_greatest = 0

    #match will store the actual string that is closest
    match = ""

    for i in phrases:
        #this is the part I need help with...
        match_current = #somehow get the percentage that the argument phrase matches the phrase it's comparing to

        #if the current phrase is a closer match than before, update it
        if match_current > match_greatest:
            match_greatest = match_current
            match = i

    return match

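...As an example, if I call match("what time it a") or match("what time sat") - these being examples of misreadings the speech recognition might give - then with my current set of phrases it should return "what time is it".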


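If you want a phonetically oriented distance, it's worth considering Soundex, a phonetic encoding under which similar-sounding words get the same code. It can be combined with an edit distance such as Levenshtein so that the comparison reflects how the words sound rather than how they are spelled.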
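
As a rough sketch of that idea (not a reference implementation): the code below assumes a simplified American Soundex and a Levenshtein distance computed over whole words rather than characters. The names soundex, levenshtein and phonetic_match are illustrative and do not come from any library.

```python
PHRASES = [
    "what time is it",
    "what's the weather",
    "what's the date",
    "what's up",
    "how are you",
]

# Soundex digit for each coded consonant; vowels, h, w and y have no digit.
_CODES = {ch: digit
          for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                                 ("l", "4"), ("mn", "5"), ("r", "6"))
          for ch in letters}


def soundex(word):
    """Return a 4-character Soundex-style code for a single word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate equal digits
            prev = digit
    return (code + "000")[:4]


def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        cur = [i]
        for j, item_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                       # deletion
                           cur[j - 1] + 1,                    # insertion
                           prev[j - 1] + (item_a != item_b)))  # substitution
        prev = cur
    return prev[-1]


def phonetic_match(phrase, phrases=PHRASES):
    """Return the phrase whose word-by-word Soundex codes are closest."""
    def key(text):
        return [soundex(word) for word in text.split()]
    target = key(phrase)
    return min(phrases, key=lambda p: levenshtein(target, key(p)))


print(phonetic_match("what time sat"))   # what time is it
print(phonetic_match("what time it a"))  # what time is it
```

Comparing codes word by word means "sat" and "sit" count as the same token, so only genuinely missing or extra words add to the distance.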



Here's an example of how I would do it.

def match(phrase):
    phrases = [
        "what time is it",
        "what's the weather",
        "what's the date",
        "what's up",
        "how are you"
    ]

    match_word_dict = {}
    for element in phrases:
        sameness = 0
        for index in range(len(element)):
            # stop once the input phrase has no more characters to compare
            if len(phrase) == index:
                break
            # count characters that agree at the same position
            if phrase[index] == element[index]:
                sameness += 1

        # share of the stored phrase's characters that matched, as a percentage
        percent = sameness * 100.0 / len(element)
        match_word_dict[element] = percent
    return match_word_dict

print(match("hello"))
print(match("hel"))

This returns a dictionary mapping each phrase to its percent match. Here's how I would print only the phrase with the highest percent match:

key, value = max(match("hello").items(), key=lambda x: x[1])
print(key, value)
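
A position-by-position comparison is sensitive to shifts: one dropped or extra character early in the input misaligns everything after it. One possible alternative for the percentage, sketched here with the standard library's difflib.SequenceMatcher, is to use its ratio() score. The helper names similarity_percent and best_match are illustrative, not part of difflib.

```python
import difflib

PHRASES = [
    "what time is it",
    "what's the weather",
    "what's the date",
    "what's up",
    "how are you",
]


def similarity_percent(phrase, candidate):
    """Similarity of two strings as a percentage between 0 and 100."""
    return difflib.SequenceMatcher(None, phrase, candidate).ratio() * 100


def best_match(phrase, phrases=PHRASES):
    """Return the stored phrase with the highest similarity to the input."""
    return max(phrases, key=lambda p: similarity_percent(phrase, p))


print(best_match("what time it a"))  # what time is it
print(best_match("what time sat"))   # what time is it
```

SequenceMatcher looks for matching blocks wherever they occur, so it tolerates inserted or missing characters that would throw off an index-based count.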

phrases = {
    1: "what time is it",
    2: "what's the weather",
    3: "what's the date",
    4: "hello",
    5: "hi",
    6: "what's up",
    7: "how are you"
}

def match(phrase):
    phr_list = phrase.split()
    max_count = 0
    key = None

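    for k, v in phrases.items():
        # count how many words of the input appear in this phrase
        count = sum(1 for word in phr_list if word.lower() in v.split())

        # remember the phrase with the most words in common so far
        if count > max_count:
            max_count = count
            key = k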

    if key:
        return phrases.get(key)
    return phrase

print(match("what time it a"))

print(match("what time sit"))

print(match(" how you good"))


what time is it
what time is it
how are you
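
A raw count of shared words tends to favour longer phrases, and it cannot tell a weak match from a strong one. One possible refinement, offered only as a sketch, is to normalise the overlap with the Jaccard index (shared words divided by all distinct words in both phrases) and return None when nothing overlaps. The function names here are illustrative.

```python
PHRASES = [
    "what time is it",
    "what's the weather",
    "what's the date",
    "hello",
    "hi",
    "what's up",
    "how are you",
]


def jaccard(a, b):
    """Shared words divided by all distinct words of both phrases."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def match_by_words(phrase, phrases=PHRASES):
    """Return the phrase with the highest word overlap, or None if none overlap."""
    best = max(phrases, key=lambda p: jaccard(phrase, p))
    return best if jaccard(phrase, best) > 0 else None


print(match_by_words("what time it a"))  # what time is it
print(match_by_words(" how you good"))   # how are you
print(match_by_words("zzz"))             # None
```

Returning None instead of echoing the input makes it easier for the caller to tell "no match" apart from a real match.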
