I have a text file:
>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|E10D
Sequence unavailable
>E8|E2|EKB
Cricket
>E87|E77|E10D
Sequence unavailable
>E27|E97|E10D
Sequence unavailable
>E8|E2|E9D
Sequence unavailable
I wrote the following code to detect the entries marked Sequence unavailable and remove them from this file:
import os

with open('input.txt') as f1, open('output.txt', 'w') as f2, \
        open('temp_file', 'w') as f3:
    lines = []  # store lines between two `>` in this list
    for line in f1:
        if line.startswith('>'):
            if lines:
                # previous record had real content: keep it
                f3.writelines(lines)
                lines = [line]
            else:
                lines.append(line)
        elif line.rstrip('\n') == 'Sequence unavailable':
            # record without a sequence: move it to output.txt
            f2.writelines(lines + [line])
            lines = []
        else:
            lines.append(line)
    f3.writelines(lines)  # flush the last record

# replace the original file with the filtered copy
os.remove('input.txt')
os.rename('temp_file', 'input.txt')
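But what I really want is to delete all the sequences available for a given question (the last column of the `>` line) whenever any entry for that question is Sequence unavailable.
For example, even though the first E9D entry is followed by lines of text, there is another E9D entry whose only content is Sequence unavailable, so no E9D entry should be kept. Only questions without such an entry should be written to the output file: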
input.txt
>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|E10D
Sequence unavailable
>E8|E2|EKB
Cricket
>E87|E77|E10D
Sequence unavailable
>E27|E97|E10D
Sequence unavailable
>E8|E2|E9D
Sequence unavailable
output.txt
>E8|E2|EKB
Cricket
Here only the EKB question has entries.