I'm writing a Python script to read a file. When I reach a particular section of the file, how the lines in that section should be read depends on information given in that same section. I found here that I could use something like:

fp = open('myfile')
last_pos = fp.tell()
line = fp.readline()
while line != '':
  if line == 'SPECIAL':
    fp.seek(last_pos)
    other_function(fp)
    break
  last_pos = fp.tell()
  line = fp.readline()

Yet, the structure of my current code is something like the following:

fh = open(filename)

# get generator function and attach None at the end to stop iteration
items = itertools.chain(((lino,line) for lino, line in enumerate(fh, start=1)), (None,))
item = True

while item:
  lino, line = next(items)

  # handle special section
  if line.startswith('SPECIAL'):

    start = fh.tell()

    for i in range(specialLines):
      lino, eline = next(items)
      # etc. get the special data I need here

    # try to set the pointer to start to reread the special section  
    fh.seek(start)

    # then reread the special section

But this approach gives the following error:

telling position disabled by next() call

Is there a way to prevent this?

2 Answers

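Using the file as an iterator (for example, calling next() on it or using it in a for loop) uses an internal read-ahead buffer. The actual file read position is further along in the file, so .tell() cannot give you the position of the next line to be yielded.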
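
The behaviour described above can be reproduced with a small sketch. It assumes Python 3 and a text-mode file; the temporary file and its contents are made up purely for illustration.

```python
import os
import tempfile

# Create a small throwaway file; the contents are illustrative only.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('foo\nSPECIAL\nbar\n')
    path = tmp.name

error_message = None
with open(path) as fh:
    first = next(fh)              # iterate the file object directly
    try:
        fh.tell()                 # refused while the read-ahead buffer is active
    except OSError as exc:
        error_message = str(exc)

os.remove(path)
print(repr(first), '->', error_message)
```

The call to fh.tell() fails only because next() was used on the file object itself; reading the same line with fh.readline() would leave tell() usable.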

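If you need to seek back and forth, the solution is not to call next() directly on the file object but to use file.readline() only. You can still use an iterator for that by using the two-argument version of iter():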

fileobj = open(filename)
fh = iter(fileobj.readline, '')

Calling next() on fh invokes fileobj.readline() until that function returns an empty string. In effect, this creates a file iterator that does not use the internal buffer.

Demo:

>>> fh = open('example.txt')
>>> fhiter = iter(fh.readline, '')
>>> next(fhiter)
'foo spam eggs\n'
>>> fh.tell()
14
>>> fh.seek(0)
0
>>> next(fhiter)
'foo spam eggs\n'

Note that your enumerate chain can be simplified to:

items = itertools.chain(enumerate(fh, start=1), (None,))

I'm not sure why you think the (None,) sentinel is needed here, though; StopIteration will still be raised, just one next() call later.

To read specialLines lines, use itertools.islice():

for lino, eline in islice(items, specialLines):
    # etc. get the special data I need here
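
As a rough illustration of how islice() shares the underlying iterator with the outer loop, here is a minimal sketch. The list of lines stands in for a file, and special_lines is a hypothetical count chosen for the example.

```python
from itertools import islice

# Stand-in for a file: any iterator of lines behaves the same way.
lines = iter(['header\n', 'SPECIAL\n', 'a\n', 'b\n', 'footer\n'])
items = enumerate(lines, start=1)

special_lines = 2   # hypothetical number of lines in the special section
section = []
remaining = []
for lino, line in items:
    if line.startswith('SPECIAL'):
        # islice() pulls the next special_lines items from the same iterator,
        # so the outer loop resumes after the section.
        section = list(islice(items, special_lines))
    else:
        remaining.append((lino, line))

print(section)    # [(3, 'a\n'), (4, 'b\n')]
print(remaining)  # [(1, 'header\n'), (5, 'footer\n')]
```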

You can also loop over the iterator directly instead of using an infinite loop and calling next():

from itertools import islice

with open(filename) as fh:
    # readline-based iterator, so fh.tell() and fh.seek() stay usable
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    for lino, line in enumerated:
        # handle special section
        if line.startswith('SPECIAL'):
            start = fh.tell()

            for lino, eline in islice(enumerated, specialLines):
                # etc. get the special data I need here
                pass

            fh.seek(start)

Note, however, that your line numbers will keep incrementing even when you seek back!
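
A small sketch of that line-number drift, together with one possible way around it (rebuilding enumerate() from a saved line number). The temporary file and its contents are assumptions made for the example; this is not code from the question.

```python
import os
import tempfile

# Throwaway three-line file, purely for illustration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one\ntwo\nthree\n')
    path = tmp.name

with open(path) as fh:
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    lino, line = next(enumerated)   # (1, 'one\n')
    start = fh.tell()               # position of line 2
    before = next(enumerated)       # (2, 'two\n')
    fh.seek(start)                  # rewind to line 2
    after = next(enumerated)        # same text, but now numbered 3

    # One possible fix: seek back and restart the numbering as well.
    fh.seek(start)
    renumbered = enumerate(iter(fh.readline, ''), start=lino + 1)
    fixed = next(renumbered)        # (2, 'two\n')

os.remove(path)
print(before, after, fixed)
```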

You probably want to refactor your code so that it does not need to re-read sections of the file, however.
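
One way such a refactoring might look is to buffer the special section in a list and make both passes over the list, so no tell() or seek() is needed. This is only a sketch: read_sections, the fixed section length, and the "all digits means integers" rule are hypothetical stand-ins for whatever the real section format requires.

```python
from itertools import islice

def read_sections(lines, special_lines=2):
    """Collect each SPECIAL section into a list so it can be scanned twice."""
    numbered = enumerate(lines, start=1)
    sections = []
    for lino, line in numbered:
        if line.startswith('SPECIAL'):
            # Buffer the section once instead of seeking back in the file.
            block = list(islice(numbered, special_lines))
            # First pass: inspect the block to decide how to parse it
            # (hypothetical rule: all-digit lines are parsed as integers).
            as_ints = all(text.strip().isdigit() for _, text in block)
            convert = int if as_ints else str.strip
            # Second pass: parse the same buffered lines.
            sections.append([(n, convert(text)) for n, text in block])
    return sections

sample = ['header\n', 'SPECIAL\n', '10\n', '20\n', 'SPECIAL\n', 'x\n', '7\n']
result = read_sections(sample)
print(result)  # [[(3, 10), (4, 20)], [(6, 'x'), (7, '7')]]
```

Because the buffered tuples keep their original line numbers, this also sidesteps the numbering drift that comes with seeking back.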

answered 2014-03-27T13:09:44.627
I'm not an expert in Python 3, but it looks like you are reading with a generator that yields the lines read from the file, so you can only move in one direction.

You will have to use another approach.

answered 2014-03-27T13:12:50.940