python - Searching another text file for the strings listed in one file?

Question

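I want to look up the strings listed in list.txt (one string per line) in another text file. If a string is found, I want to print 'string,one_sentence'; if it is not found, 'string,another_sentence'. I am using the following code, but it only finds the last string from the list in list.txt. I can't figure out what the cause might be.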

data = open('c:/tmp/textfile.TXT').read()
for x in open('c:/tmp/list.txt').readlines():
    if x in data:
        print(x,',one_sentence')
    else:
        print(x,',another_sentence')

score 5 · Accepted Answer

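When you read a file with readlines(), the resulting list elements still carry a trailing newline character. Most likely, this is why you get fewer matches than expected.

Instead of writing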

for x in list:

write

for x in (s.strip() for s in list):

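This strips leading and trailing whitespace from every element of list, so it also removes the trailing newline characters from the strings.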
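A small sketch of why only the last string matched: the final line of a file often has no newline after it, so it is the only element that readlines() returns without a trailing '\n'. Here io.StringIO stands in for list.txt, and the sample strings are invented for illustration.

```python
import io

# Hypothetical stand-ins for the two files; the contents are made up.
haystack = "alpha beta gamma"
list_file = io.StringIO("alpha\nbeta\ngamma")  # note: no newline after the last line

lines = list_file.readlines()
print(lines)  # each element except the last ends with '\n'

# Matching the raw lines: 'alpha\n' is not a substring of the haystack.
raw_hits = [x for x in lines if x in haystack]

# Matching after strip(): the newline is gone, so every string is found.
stripped_hits = [x.strip() for x in lines if x.strip() in haystack]

print(raw_hits)
print(stripped_hits)
```

If list.txt did end with a newline, the raw comparison would likely find nothing at all, unless a listed string happens to sit right before a line break in the searched text.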

To make your program more robust, you could do something like this:

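import sys

# Read the whole text to search (the "haystack") into memory.
with open('c:/tmp/textfile.TXT') as f:
    haystack = f.read()

if not haystack:
    sys.exit("Could not read haystack data :-(")

# Walk the list file line by line, stripping the newline from each "needle".
with open('c:/tmp/list.txt') as f:
    for needle in (line.strip() for line in f):
        if needle in haystack:
            print(needle, ',one_sentence')
        else:
            print(needle, ',another_sentence')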

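I didn't want to change too much. The most important difference is that I use a context manager here, via the with statement. It makes sure the file is handled properly for you (mainly, that it gets closed). In addition, the "needle" lines are stripped on the fly by a generator expression. This approach reads and processes the needle file line by line instead of loading the whole file into memory at once. Of course, that only matters for large files.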
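The same idea can be wrapped in a reusable generator. This is only a sketch: the function name label_needles, the temporary files, and their contents are assumptions made for the demo. It differs from the code above in two ways. It skips blank lines, because an empty string is a substring of every string and would always count as found. It also builds the output with an f-string, so there is no space before the comma (print's default separator inserts one).

```python
import os
import tempfile

def label_needles(haystack_path, list_path):
    """Yield 'needle,one_sentence' if found, else 'needle,another_sentence'."""
    with open(haystack_path) as f:
        haystack = f.read()
    with open(list_path) as f:
        for line in f:                 # lazy: one line at a time
            needle = line.strip()
            if not needle:             # '' is "in" every string, so skip blanks
                continue
            label = 'one_sentence' if needle in haystack else 'another_sentence'
            yield f'{needle},{label}'

# Demo with throwaway files; paths and contents are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    haystack_path = os.path.join(tmp, 'textfile.TXT')
    list_path = os.path.join(tmp, 'list.txt')
    with open(haystack_path, 'w') as f:
        f.write('the quick brown fox\n')
    with open(list_path, 'w') as f:
        f.write('quick\n\nslow\nfox')
    # Consume the generator while the temporary files still exist.
    results = list(label_needles(haystack_path, list_path))

for row in results:
    print(row)
```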

score 0 · Accepted Answer

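readlines() keeps a newline at the end of each string it reads from the list file. Call strip() on those strings to remove it (along with any other leading and trailing whitespace).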
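If the listed strings may themselves begin or end with spaces that matter, rstrip('\n') is a narrower alternative that removes only the newline. A quick comparison, using an invented sample line:

```python
line = '  padded needle \n'   # hypothetical line as returned by readlines()

stripped = line.strip()            # drops all leading and trailing whitespace
newline_only = line.rstrip('\n')   # drops only trailing newline characters

print(repr(stripped))
print(repr(newline_only))
```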
