0

I am reading data from two text files and comparing them. If a line in file1 also appears in file2, it should be removed from file1.

import sys
File1 = open("file1.txt")
File2 = open("file2.txt")
for lines in File1:
    for line in File2:
        for lines in line:
            print lines
File1
you
you to
you too
why
toh

File2
you
you to

My program prints the words in file2. How can I remove the entries in file2 from file1?

4 Answers

2

You can use the fileinput module with inplace=True, and load the second file into a set for lookup purposes...

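import fileinput
import sys

# Load file2 into a set for fast membership tests.
with open('file2.txt') as fin:
    exclude = set(line.rstrip() for line in fin)

# With inplace=True, standard output is redirected into file1.txt,
# so only the lines written here remain in the file.
for line in fileinput.input('file1.txt', inplace=True):
    if line.rstrip() not in exclude:
        sys.stdout.write(line)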
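Since the in-place rewrite overwrites file1.txt, it may be worth keeping a copy of the original. The following is a minimal Python 3 sketch of the same idea using the backup parameter of fileinput.input. The function name remove_matching_lines and the ".bak" extension are assumptions made for illustration, not part of the answer above.

```python
import fileinput


def remove_matching_lines(target_path, exclude_path, backup_ext=".bak"):
    """Rewrite target_path in place, dropping lines that also appear in exclude_path."""
    with open(exclude_path) as fin:
        exclude = {line.rstrip() for line in fin}

    # With inplace=True, print() output goes into the file being read.
    # A non-empty backup extension keeps the original as target_path + backup_ext.
    with fileinput.input(target_path, inplace=True, backup=backup_ext) as f:
        for line in f:
            if line.rstrip() not in exclude:
                print(line, end="")
```

Calling remove_matching_lines("file1.txt", "file2.txt") would leave the untouched original next to the result as file1.txt.bak.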
Answered 2013-03-01T05:43:11.110
1

You can do it like this:

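# Read file2 once; a list works, but each lookup scans the whole list.
with open('file2.txt') as f2:
    file2 = f2.readlines()

with open('file1.txt') as f1, open('result.txt', 'w') as result:
    for line in f1:
        # Lines are compared with their trailing newline intact.
        if line not in file2:
            result.write(line)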

This does not modify "file1.txt". Instead it creates another file, "result.txt", containing the lines of file1 that are not in file2.
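If the goal is for file1.txt itself to end up filtered, one option is to write the result to a temporary file and then swap it over the original. The sketch below is an assumption about how that could look in Python 3; the helper name filter_file_in_place is hypothetical, and it uses a set rather than a list for the lookup.

```python
import os
import tempfile


def filter_file_in_place(target_path, exclude_path):
    """Replace target_path with a copy that omits lines found in exclude_path."""
    with open(exclude_path) as f2:
        exclude = {line.rstrip("\n") for line in f2}

    # Create the temporary file next to the target so that os.replace
    # does not have to move data across filesystems.
    target_dir = os.path.dirname(os.path.abspath(target_path))
    with open(target_path) as src, tempfile.NamedTemporaryFile(
            "w", dir=target_dir, delete=False) as tmp:
        for line in src:
            if line.rstrip("\n") not in exclude:
                tmp.write(line)

    # Both files are closed at this point; swap the filtered copy into place.
    os.replace(tmp.name, target_path)
```

The original is only replaced once the filtered copy has been written completely, so an error part-way through leaves file1.txt as it was.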

Answered 2013-02-28T15:10:44.547
1
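# Strip trailing whitespace so lines compare equal regardless of line endings.
with open("f1") as f:
    file1 = set(line.rstrip() for line in f)
with open("f2") as f:
    file2 = set(line.rstrip() for line in f)

# (file1 - file2) | file2 is the union of both sets;
# file1 - file2 alone gives the lines that appear only in f1.
print((file1 - file2) | file2)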

set(['other great lines of text', 'from a file', 'against.', 'keep text', 'which is wanted to compare', 'This is', 'some text', 'keep is wanted to compare'])

f1

This is 
keep text
from a file 
keep is wanted to compare
against.

f2

This is 
some text
from a file 
which is wanted to compare
against.
other great lines of text

Order is not preserved, which may or may not be a problem.
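If order does matter, one way around it is to keep only the second file in a set and walk the first file as a list. This is a small Python 3 sketch under that assumption; the function name lines_not_in is hypothetical.

```python
def lines_not_in(path_a, path_b):
    """Return the lines of path_a that do not occur in path_b, in their original order."""
    with open(path_b) as fb:
        seen = {line.rstrip() for line in fb}

    with open(path_a) as fa:
        stripped = (line.rstrip() for line in fa)
        # A list comprehension keeps the order in which lines appear in path_a.
        return [text for text in stripped if text not in seen]
```

With the f1 and f2 contents shown above, lines_not_in("f1", "f2") returns ['keep text', 'keep is wanted to compare'].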

Answered 2013-02-28T16:31:55.230
1

If file2 fits in memory, you can use a set() to avoid an O(n) lookup for each line:

with open('file2.txt') as file2:
    entries = set(file2.read().splitlines())

with open('file1.txt') as file1, open('output.txt', 'w') as outfile:
    outfile.writelines(line for line in file1
                       if line.rstrip("\n") not in entries)
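To get a feel for the difference between list and set lookups, a rough timing comparison can help. The data below is made up for illustration, and the actual numbers will vary by machine, so treat this Python 3 snippet as a sketch rather than a benchmark.

```python
import timeit

# Hypothetical entries standing in for the lines of file2.
entries_list = ["line %d" % i for i in range(20000)]
entries_set = set(entries_list)

# The last element is the worst case for a list: every item is scanned first.
probe = "line 19999"

list_time = timeit.timeit(lambda: probe in entries_list, number=200)
set_time = timeit.timeit(lambda: probe in entries_set, number=200)

print("list: %.6fs  set: %.6fs" % (list_time, set_time))
```

The list lookup grows with the number of entries, while the set lookup stays roughly constant on average, which is why the set pays off once file1 has many lines to check.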
Answered 2013-03-01T05:33:24.233