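Posting an answer at the behest of commenters on my answer to a similar question, where the same technique was used to mutate the last line of a file, not just get it.
For a file of significant size, mmap is the best way to do this. To improve on the existing mmap answer, this version is portable between Windows and Linux, and should run faster (though it won't work without some modifications on 32-bit Python with files in the GB range; see the other answer for hints on handling this, and for modifying it to work on Python 2).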
import io  # Gets consistent version of open for both Py2.7 and Py3.x
import itertools
import mmap
import os  # Only needed if newlines are normalized via os.linesep
def skip_back_lines(mm, numlines, startidx):
    '''Factored out to simplify handling of n and offset'''
    for _ in itertools.repeat(None, numlines):
        startidx = mm.rfind(b'\n', 0, startidx)
        if startidx < 0:
            break
    return startidx
def tail(f, n, offset=0):
    # Reopen file in binary mode
    with io.open(f.name, 'rb') as binf, mmap.mmap(binf.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # len(mm) - 1 handles files ending w/newline by getting the prior line
        startofline = skip_back_lines(mm, offset, len(mm) - 1)
        if startofline < 0:
            return []  # Offset lines consumed whole file, nothing to return
            # If using a generator function (yield-ing, see below),
            # this should be a plain return, no empty list
        endoflines = startofline + 1  # Slice end to omit offset lines
        # Find start of lines to capture (add 1 to move from newline to beginning of following line)
        startofline = skip_back_lines(mm, n, startofline) + 1
        # Passing True to splitlines makes it return the list of lines without
        # removing the trailing newline (if any), so list mimics f.readlines()
        return mm[startofline:endoflines].splitlines(True)
        # If Windows style \r\n newlines need to be normalized to \n, and input
        # is ASCII compatible, can normalize newlines with:
        # return mm[startofline:endoflines].replace(os.linesep.encode('ascii'), b'\n').splitlines(True)
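To see the index arithmetic in isolation, here is a minimal sketch that runs the same backward search on a plain bytes object instead of an mmap (bytes.rfind takes the same arguments). The name last_lines is hypothetical and only used for this illustration.

```python
def skip_back_lines(buf, numlines, startidx):
    """Step startidx back over numlines newlines; -1 means the start was reached."""
    for _ in range(numlines):
        startidx = buf.rfind(b'\n', 0, startidx)
        if startidx < 0:
            break
    return startidx


def last_lines(buf, n, offset=0):
    """Hypothetical in-memory stand-in for tail(), operating on bytes."""
    # len(buf) - 1 skips a trailing newline so it does not count as a line break
    startofline = skip_back_lines(buf, offset, len(buf) - 1)
    if startofline < 0:
        return []  # The offset consumed everything
    endoflines = startofline + 1  # Slice end that leaves out the offset lines
    startofline = skip_back_lines(buf, n, startofline) + 1
    return buf[startofline:endoflines].splitlines(True)


print(last_lines(b'a\nb\nc\n', 2))            # [b'b\n', b'c\n']
print(last_lines(b'a\nb\nc\n', 1, offset=1))  # [b'b\n']
```

Asking for more lines than exist simply returns everything before the offset, because the search stops at -1 and the +1 turns that into index 0.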
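This assumes the number of lines tailed is small enough that you can safely read them all into memory at once; you could also make this a generator function and manually read a line at a time by replacing the final line with: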
        mm.seek(startofline)
        # Call mm.readline until endoflines is reached. At most n lines lie
        # between startofline and endoflines, so no explicit count is needed,
        # and stopping at endoflines keeps the skipped offset lines from being
        # yielded when the file has fewer than n lines before them
        while mm.tell() < endoflines:
            yield mm.readline()
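For reference, here is a sketch of what the whole generator variant might look like on Python 3 once the pieces are put together. It takes a path rather than an open file object, the names tail_iter and demo are hypothetical, and it stops at endoflines instead of counting to n so that skipped offset lines are never yielded. It assumes a non-empty file, since mmap refuses to map an empty one.

```python
import mmap
import os
import tempfile


def skip_back_lines(mm, numlines, startidx):
    """Step startidx back over numlines newlines; -1 means the start was reached."""
    for _ in range(numlines):
        startidx = mm.rfind(b'\n', 0, startidx)
        if startidx < 0:
            break
    return startidx


def tail_iter(path, n, offset=0):
    """Hypothetical generator: yield the n lines that precede the last offset lines."""
    with open(path, 'rb') as binf, \
            mmap.mmap(binf.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        startofline = skip_back_lines(mm, offset, len(mm) - 1)
        if startofline < 0:
            return  # Plain return: the offset consumed the whole file
        endoflines = startofline + 1
        mm.seek(skip_back_lines(mm, n, startofline) + 1)
        while mm.tell() < endoflines:
            yield mm.readline()


def demo():
    """Write a small temporary file and tail it two different ways."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as out:
            out.write(b'one\ntwo\nthree\nfour\n')
        return list(tail_iter(path, 2)), list(tail_iter(path, 5, offset=1))
    finally:
        os.remove(path)


print(demo())
```

Because the mmap stays open only while the generator is running, the caller should consume it fully (or close it) rather than leave it half-iterated.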
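Lastly, this reads in binary mode (necessary to use mmap), so it gives str lines (Py2) and bytes lines (Py3); if you want unicode (Py2) or str (Py3), the iterative approach could be tweaked to decode for you and/or fix newlines: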
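        # Read lines up to endoflines, so skipped offset lines are never produced
        lines = iter(lambda: mm.readline() if mm.tell() < endoflines else b'', b'')
        # Binary file objects on Py3 have no encoding attribute, hence getattr
        encoding = getattr(f, 'encoding', None)
        if encoding:  # Decode if the passed file was opened with a specific encoding
            lines = (line.decode(encoding) for line in lines)
        if 'b' not in f.mode:  # Fix line breaks if passed file opened in text mode
            lines = (line.replace(os.linesep, '\n') for line in lines)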
        # Python 3.2 and earlier:
        for line in lines:
            yield line
        # 3.3+:
        yield from lines
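The decode-and-normalize step can be tried on its own without any file at all. This is a small sketch on Python 3; decode_lines is a hypothetical helper, and the line separator is passed in explicitly instead of being read from os.linesep so the result is the same on every platform.

```python
def decode_lines(lines, encoding, linesep='\n'):
    """Hypothetical helper: lazily decode bytes lines and normalize line endings."""
    # Each generator expression captures the previous iterable when it is created,
    # so the stages chain without reading anything until the result is consumed
    lines = (line.decode(encoding) for line in lines)
    if linesep != '\n':
        lines = (line.replace(linesep, '\n') for line in lines)
    return lines


raw = [b'caf\xc3\xa9\r\n', b'done\r\n']
print(list(decode_lines(raw, 'utf-8', linesep='\r\n')))  # ['café\n', 'done\n']
```

Decoding line by line is only safe for encodings where a newline byte cannot occur inside another character (UTF-8 and the usual single-byte encodings qualify; UTF-16 does not).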
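Note: I typed this all up on a machine where I lack access to Python to test. Please let me know in the comments if I typoed anything; this is similar enough to my other answer that I think it should work, but the tweaks (e.g. handling an offset) could lead to subtle errors.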