python - Output always gives the last line or all lines

Question

I have two tab-delimited files, and I need to test every line in the first file against all the lines in the other file. For example,

File 1:

row1    c1    36    345   A
row2    c3    36    9949  B
row3    c4    36    858   C

File 2:

row1    c1    3455  3800
row2    c3    6784  7843
row3    c3    10564 99302
row4    c5    1405  1563

Suppose I want to output all lines of file1 for which col[3] of file1 is less than any (not every) col[2] of file2, given that col[1] is the same.

Expected output:

row1    c1    36    345   A
row2    c3    36    9949  B

Since I am working in Ubuntu, I would like the command to look like this: python code.py [file1] [file2] > [output]

I wrote the following code:

import sys

filename1 = sys.argv[1]
filename2 = sys.argv[2]

file1 = open(filename1, 'r')

done = False

for x in file1.readlines():
    col = x.strip().split()
    file2 = open(filename2, 'r')
    for y in file2.readlines():
        col2 = y.strip().split()
        if col[1] == col2[1] and col[3] < col2[2]:
            done = True
            break
        else: continue
print x

However, the output looks like this:

row2    c3    36    9949  B

Basically, I always get only the last line for which the condition in the nested loop was true. I tried this:

    if done == True: print x

(with one level of indentation), but now it prints every line in file1, regardless of the condition tested in the previous loop. (>_<)

score 3 · Accepted Answer

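You forgot to reset the done variable after the first match, so once it becomes True it stays True for every later line. In this case you don't need the variable at all. To fix the code, replace done = True with print x, and use int(col[3]) < int(col2[2]) so that the columns are compared as numbers (integers) rather than as strings.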
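For reference, here is a minimal sketch of how the fixed logic might look. It is not the asker's exact script: it assumes Python 3 (print as a function), and the helper name matching_rows is hypothetical. It also reads file2 once up front instead of reopening it for every line of file1, which is a convenience change rather than part of the fix.

```python
import sys


def matching_rows(lines1, lines2):
    """Return the file1 lines whose col[3] is less than col[2] of at least
    one file2 line that has the same col[1]."""
    # Parse file2 once instead of reopening it for every file1 line.
    rows2 = [line.split() for line in lines2 if line.strip()]
    result = []
    for line in lines1:
        col = line.split()
        if not col:
            continue  # skip blank lines
        for col2 in rows2:
            # int() makes the comparison numeric; compared as strings,
            # "9949" < "10564" is False because "9" sorts after "1".
            if col[1] == col2[1] and int(col[3]) < int(col2[2]):
                result.append(line.rstrip("\n"))
                break  # one match is enough, so stop scanning file2
    return result


# Only run as a script when both file names are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        for row in matching_rows(f1, f2):
            print(row)
```

Saved as code.py, this could be run as python3 code.py [file1] [file2] > [output]. Because the matching line is collected inside the inner loop and break follows immediately, no flag needs to be reset between lines of file1.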
