python - Checking for words in a list in Python

Question

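I am writing a spell-check function that uses two text files: one containing misspelled text and one containing a large set of dictionary words. I have converted the misspelled text into a list of lists of strings (one inner list per line), and the dictionary file into a list of words. Now I need to check whether each word from the misspelled text appears in my list of dictionary words.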

def spellCheck():
    checkFile=input('Enter file name: ')
    inFile=open(checkFile,'r')

# This separates my original text file into a list like this
# [['It','was','the','besst','of','times,'],
# ['it','was','teh','worst','of','times']]

    separate=[]
    for line in inFile:
        separate.append(line.split())

# This opens my list of words from the dictionary and 
# turns it into a list of the words.

    wordFile=open('words.txt','r')
    words=wordFile.read()
    wordList=(list(words.split()))
    wordFile.close()


# I need this newList to be a list of the correctly spelled words 
# in my separate[] list and if the word isn't spelled correctly 
# it will go into another if statement... 

    newList=[]
    for word in separate:
        if word in wordList:
            newList.append(word)
    return newList

score 3 · Accepted Answer

Try this:

newList = []
for line in separate:
    for word in line:
        if word in wordList:
            newList.append(word)
return newList

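The problem is that you are iterating over separate, which is a list of lists. Each item the loop yields is therefore a whole line (a list), and wordList contains only strings, not lists, so the if test always fails. The words you want to check are inside the sublists of separate, so iterate over them in a second for loop. Alternatively, you can use for word in itertools.chain.from_iterable(separate).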
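As a minimal sketch of the itertools alternative mentioned above: the sample data below is hypothetical and only stands in for the contents of the two files from the question. It first shows why the original membership test never succeeds, then flattens separate with chain.from_iterable.

```python
from itertools import chain

# Hypothetical sample data standing in for the two files in the question.
separate = [["It", "was", "the", "besst", "of", "times"],
            ["it", "was", "teh", "worst", "of", "times"]]
wordList = ["it", "was", "the", "best", "worst", "of", "times"]

# A whole line is a list, and a list is never equal to any string in
# wordList, so this membership test is always False.
line_found = separate[0] in wordList

# chain.from_iterable yields the words of every inner list in order,
# so a single loop is enough.
newList = [word for word in chain.from_iterable(separate) if word in wordList]

print(line_found)
print(newList)
```

Note that the comparison is case-sensitive, so with this sample data "It" is not matched against "it".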

Hope this helps.

score 1 · Accepted Answer

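First, a word about data structures. Instead of lists, you should use sets, since you (apparently) only want one copy of each word. You can build the sets from your lists: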

input_words = set(word for line in separate for word in line) # since it is a list of lists
correct_words = set(word_list)

Then it is as simple as:

new_list = input_words.intersection(correct_words)

If you want the incorrect words, there is another one-liner:

incorrect = input_words.difference(correct_words)
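The set operations above can be tried end to end with a small sketch. The sample data is hypothetical and stands in for the two files; the & and - operators are used here as the operator forms of intersection and difference.

```python
# Hypothetical sample data standing in for the two files in the question.
separate = [["It", "was", "the", "besst", "of", "times"],
            ["it", "was", "teh", "worst", "of", "times"]]
word_list = ["it", "was", "the", "best", "worst", "of", "times"]

# Flatten the list of lists into a set, which drops duplicate words.
input_words = {word for line in separate for word in line}
correct_words = set(word_list)

# Words found in the dictionary, and words that are not.
new_list = input_words & correct_words   # same as .intersection()
incorrect = input_words - correct_words  # same as .difference()

print(sorted(new_list))
print(sorted(incorrect))
```

Sets do not preserve the order of the words or how often each one occurred, so this fits best when only the distinct correct and incorrect words are needed. The comparison is also case-sensitive: with this data "It" ends up in incorrect.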

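Note that I used names_with_underscores rather than CamelCase, as recommended in PEP 8. Keep in mind, however, that this approach is not very effective as a spell checker, because it does not take context into account.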
