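I am trying to pull data out of a text file. The lines of interest run from the first line matching 'Marker 2' through the last line matching 'Marker 3'. The markers can occur more than once (they repeat). I want the smallest line number for 'Marker 2', the largest line number for 'Marker 3', and all of the text between that minimum and maximum. The code below works, but I would like to see how to do this in a more Pythonic way: more efficient and with less code.
Why do I have to open the same file twice? If I don't, I get an error. Do xreadlines and readlines conflict?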
file_seeklines.py
import sys
filename = sys.argv[1]
line_number = []
number = 0
## Fetch the boundary(start, end points)
f = open(filename,'r')
for line in f.xreadlines():
    number += 1
    if "marker 2" in line.strip().lower():
        line_number.append(number)
    if "marker 3" in line.strip().lower():
        line_number.append(number)
#print line_number[0], line_number[-1]
start, end = line_number[0]-1, line_number[-1]
f.close()
## Grab the boundary
g = open(filename,'r')
linelist = g.readlines()
try:
    for i in xrange(start, end):
        print linelist[i]
except:
    print "failed"
    pass
g.close()
file.txt
Welcome notice
------------------------
Hello there, welcome! Foo
Marker 0
hello
world
Bar
Yes!
Foo
How are ya?!
Bar
Have a great day!
Marker 1
Hello 1 2
12
MarKer 2
Hello 23
23
Marker 3
Hello 34
34
marker 2
Hello 45
45
MArker 3
Output
MarKer 2
Hello 23
23
Marker 3
Hello 34
34
marker 2
Hello 45
45
MArker 3
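
For comparison, here is a minimal sketch of a single-pass version, not a definitive answer. It assumes the whole file fits in memory and is written for Python 3, whereas the script above is Python 2. The function name `extract_block` and its parameters are hypothetical names chosen for illustration. It follows the description above: the first line containing 'marker 2' through the last line containing 'marker 3', matched case-insensitively. Because the lines are read into a list once, the file does not need to be opened a second time.

```python
def extract_block(lines, start_marker="marker 2", end_marker="marker 3"):
    """Return the lines from the first start_marker through the last end_marker.

    `lines` can be any iterable of strings, such as an open file object.
    Matching is a case-insensitive substring test, as in the original script.
    """
    lines = list(lines)  # read everything once so it can be sliced afterwards
    first = None  # index of the first line containing start_marker
    last = None   # index of the last line containing end_marker
    for index, line in enumerate(lines):
        lowered = line.lower()
        if first is None and start_marker in lowered:
            first = index
        if end_marker in lowered:
            last = index
    # No usable range if a marker is missing or the last end marker
    # comes before the first start marker.
    if first is None or last is None or last < first:
        return []
    return lines[first:last + 1]
```

With a file, this could be used as `with open(filename) as f: sys.stdout.writelines(extract_block(f))`. Writing the lines back unchanged avoids the blank line that `print` adds after each line that already ends in a newline.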