My question is whether this code can be improved so that searching the whole word_list.txt file for the words in my defined word list is faster. I have been told there is a way to do this in a single pass, for all 14 words at once, by putting the file into an appropriate data structure.

word_list = ['serve','rival','lovely','caveat','devote',\
         'irving','livery','selves','latvian','saviour',\
         'observe','octavian','dovetail','Levantine']

def sorted_word(word):
    """Return the word with its letters sorted."""
    list_chars = list(word)
    list_chars.sort()
    word_sort = ''.join(list_chars)
    return word_sort

print("Please wait for a few moment...")
print()

#Create an empty dictionary to store our words and their anagrams
dictionary = {}
for words in word_list:
    value = [] #Create an empty list for values for the key
    individual_word_string = words.lower()

    for word in open('word_list.txt'):
        word1 = word.strip().lower() #Use for comparing

        #When sorted words are the same, update the dictionary        
        if sorted_word(individual_word_string) == sorted_word(word1):
            if word1[0] == 'v':
                value.append(word.strip()) #Print original word in word_list
                tempDict = {individual_word_string:value}
                dictionary.update(tempDict)

#Print dictionary
for key,value in dictionary.items():
    print("{:<10} = {:<}".format(key,value))

Because of new-user restrictions I cannot post a picture of my results. By the way, the output should print, for each word, the anagrams that start with 'v'. Any help improving this code would be appreciated.

1 Answer

If you have enough memory, you can try storing the values in a dictionary and then doing hash lookups on it (very fast). The benefit is that you can pickle it and reuse it later (building the dictionary is slow, but lookups are fast). If you have a very large data set, you may want to use map-reduce; I suggest disco-project, a nice Python/Erlang framework.

word_list = ['serve','rival','lovely','caveat','devote',\
         'irving','livery','selves','latvian','saviour',\
         'observe','octavian','dovetail','Levantine']

print("Please wait for a few moment...")
print()

anagrams = {}

for word in open('word_list.txt'):
    word = word.strip().lower()  #Normalize for comparing
    key = tuple(sorted(word))    #Sorted letters identify the anagram group
    anagrams[key] = anagrams.get(key, []) + [word]

for word in word_list:
    print("%s -> %s" % (word.lower(), anagrams.get(tuple(sorted(word.lower())), [])))
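
The pickling idea mentioned above can be sketched as follows. This is a minimal example, not part of the original answer: it builds the index from a small in-memory sample list instead of word_list.txt, and the file name `anagrams.pickle` is an assumption.

```python
import pickle
from collections import defaultdict

# Small sample list standing in for word_list.txt.
sample_words = ['serve', 'verse', 'sever', 'rival', 'viral']

# Build the sorted-letters -> anagrams index once (the slow part).
anagrams = defaultdict(list)
for word in sample_words:
    anagrams[tuple(sorted(word))].append(word)

# Pickle the index so future runs can skip the rebuild.
with open('anagrams.pickle', 'wb') as f:
    pickle.dump(dict(anagrams), f)

# Later (or in another run): load the index and look up instantly.
with open('anagrams.pickle', 'rb') as f:
    index = pickle.load(f)

print(index[tuple(sorted('serve'))])  # -> ['serve', 'verse', 'sever']
```

Loading the pickled index is a single read plus O(1) hash lookups per query, which is the speedup the answer describes.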
answered 2012-10-19T10:48:16.837