python - Python: list append problem

Question

I have some sort of logic error that I can't seem to pick out. Here is what I have:

Document = 'Sample1'
locationslist = []
thedictionary = []
userword = ['the', 'a']
filename = 'Sample1'
for inneritem in userword:
     thedictionary.append((inneritem,locationslist))
     for position, item in enumerate(file_contents): 
        if item == inneritem:
            locationslist.append(position)
wordlist = (thedictionary, Document)
print wordlist

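So basically I am trying to build a larger list (thedictionary) out of a smaller list (locationslist) paired with a specific user word. I almost have it, except that the output puts all positions of all words (there are only 2 of them, 'the' and 'a') in every list. It looks like a simple logic problem, but I can't spot it. The output is: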

([('the', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161, 2, 49, 57, 131, 167, 189, 194, 207, 215, 224]), 
  ('a', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161, 2, 49, 57, 131, 167, 189, 194, 207, 215, 224])], 
 'Sample1')

But it should be:

([('the', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161]), 
  ('a', [2, 49, 57, 131, 167, 189, 194, 207, 215, 224])], 
 'Sample1')

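See how both position lists end up appended to each entry of the problematic output, for both user words 'the' and 'a'? I could use some advice on what I'm doing wrong here.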

score 3 · Accepted Answer

You only create one locationslist, so you only have one. It is shared by both words. You need to create a new locationslist on each iteration of the loop:

for inneritem in userword:
    locationslist = []
    thedictionary.append((inneritem,locationslist))
    # etc.
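To make the shared-list effect concrete, here is a minimal sketch that runs the buggy and the fixed loop side by side. The sample text and the function names find_positions_shared and find_positions_fresh are made up for illustration; the question never shows how file_contents is built, so a list of words is assumed.

```python
# Hypothetical sample data standing in for file_contents (not from the question).
file_contents = 'the cat saw a dog and the bird saw a fish'.split()
userword = ['the', 'a']


def find_positions_shared(words, contents):
    """Buggy version: one list is created once and reused for every word."""
    locationslist = []
    result = []
    for word in words:
        # Every tuple stores a reference to the SAME list object.
        result.append((word, locationslist))
        for position, item in enumerate(contents):
            if item == word:
                locationslist.append(position)
    return result


def find_positions_fresh(words, contents):
    """Fixed version: a new list is created on each loop iteration."""
    result = []
    for word in words:
        locationslist = []  # fresh list for this word only
        result.append((word, locationslist))
        for position, item in enumerate(contents):
            if item == word:
                locationslist.append(position)
    return result


shared = find_positions_shared(userword, file_contents)
fresh = find_positions_fresh(userword, file_contents)

print(shared)  # both entries show the same combined positions
print(fresh)   # each entry shows only its own positions
```

In the buggy version, shared[0][1] is shared[1][1] evaluates to True, which is why every entry shows the positions of all words.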

score 1 · Accepted Answer

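You only created one locationslist, so every locationslist.append() call modifies that same list. You are appending the same locationslist to thedictionary for every element of userword. You should create a separate locations list for each element of userword.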

The algorithm you have can be written as a set of nested list comprehensions, which will create the correct lists:

user_word = ['the', 'a']
word_list = ([(uw, 
               [position for position, item in enumerate(file_contents) 
                if item == uw]) 
               for uw in user_word], 
             'Sample1')

This still calls enumerate(file_contents) once for each item in user_word, which could be expensive if file_contents is large.

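I suggest rewriting this to pass over file_contents once, checking the item at each position against the contents of user_word, and appending the position only to the list for the specific user_word found at that position. I suggest using a dictionary to keep the user_word lists separate and accessible: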

document = 'Sample1'

temp_dict = dict((uw, []) for uw in user_word)

for position, item in enumerate(file_contents):
    if item in temp_dict:
        temp_dict[item].append(position)

wordlist = ([(uw, temp_dict[uw]) for uw in user_word], document)

Either solution gives you the positions of each user_word in the scanned document, in order of occurrence. It also returns the list structure you are looking for.
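As a rough, self-contained sketch of the two approaches above, the following wraps each in a function and checks that they agree on a small input. The function names positions_nested and positions_single_pass and the sample sentence are assumptions for illustration; file_contents is again assumed to be a list of words.

```python
def positions_nested(words, contents):
    """Nested comprehensions: scans contents once per word."""
    return [(w, [p for p, item in enumerate(contents) if item == w])
            for w in words]


def positions_single_pass(words, contents):
    """Dictionary approach: scans contents exactly once."""
    positions = dict((w, []) for w in words)
    for position, item in enumerate(contents):
        if item in positions:  # average O(1) membership test on a dict
            positions[item].append(position)
    # Rebuild the list of tuples in the order the words were given.
    return [(w, positions[w]) for w in words]


# Hypothetical sample data (not from the question).
file_contents = 'a bird and the cat saw a dog near the tree'.split()
user_word = ['the', 'a']
document = 'Sample1'

wordlist = (positions_single_pass(user_word, file_contents), document)
print(wordlist)
```

The single-pass version does one scan of file_contents regardless of how many words are in user_word, while the nested version does one scan per word.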
