I'm trying to scrape some information out of a large text file and ran into a bit of a problem.

Several items in this file contain the start_needle as an ID, but they appear out of order. The end_needle is the string that marks the end of an item. I can find the starting line, but how do I get the line number of the first end_needle that follows it?

Basically, "find the next instance of end_needle after start_needle."

start_needle = '725160001'
end_needle = '* * END ITEM * *'

filename = 'LAS3300Combined.txt'
target = open('file.txt', 'w')

start_list = []

with open(filename) as myFile:
    for num, line in enumerate(myFile, 1):
        if start_needle in line:
            start_list.append(num)

1 Answer

Toggle a boolean flag when an item starts and again when it ends:

start_list = []
end_list = []
started = False  # True while we are inside a matching item

with open(filename) as myFile:
    for num, line in enumerate(myFile, 1):
        if not started and start_needle in line:
            start_list.append(num)
            started = True
        # Strip the trailing newline first; otherwise endswith() never matches.
        if started and line.rstrip().endswith(end_needle):
            end_list.append(num)
            started = False
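
For reference, here is the same flag-toggling idea as a small self-contained sketch that can be run without the original file. The function name `find_item_ranges` and the sample lines are made up for illustration, and it assumes a substring match on end_needle is good enough (it uses `in` rather than `endswith`, so trailing whitespace does not matter):

```python
def find_item_ranges(lines, start_needle, end_needle):
    """Return 1-based (start_line, end_line) pairs for each matching item."""
    ranges = []
    start = None  # line number of the currently open item, if any
    for num, line in enumerate(lines, 1):
        if start is None and start_needle in line:
            start = num
        # Also checked on the start line, in case an item opens and closes there.
        if start is not None and end_needle in line:
            ranges.append((start, num))
            start = None
    return ranges


# Hypothetical sample data standing in for the real file.
sample = [
    "header\n",
    "ID 725160001\n",
    "some data\n",
    "* * END ITEM * *\n",
    "ID 999\n",
    "* * END ITEM * *\n",
    "ID 725160001 again\n",
    "more data\n",
    "* * END ITEM * *\n",
]
ranges = find_item_ranges(sample, "725160001", "* * END ITEM * *")
print(ranges)  # [(2, 4), (7, 9)]
```

Because the function only iterates over its first argument, an open file object can be passed in place of the list, e.g. `find_item_ranges(myFile, start_needle, end_needle)` inside the `with open(filename) as myFile:` block.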
Answered 2013-10-09T01:08:34.123