12

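I need to get the value of the previous line in a file and compare it with the current line as I iterate through the file. The file is huge, so I cannot read it in whole, and I cannot randomly access a line number with linecache, because that library function still reads the whole file into memory.

EDIT: I am sorry, I forgot to mention that I have to read the file backwards.

EDIT 2

I have tried the following: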

 f = open("filename", "r")
 for line in reversed(f.readlines()): # this doesn't work because there are too many lines to read into memory

 line = linecache.getline("filename", num_line) # this also doesn't work due to the same problem above. 
4

3 Answers

21

Just save the previous line when you iterate to the next one:

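prevLine = ""
with open("filename") as f:  # same placeholder name as in the question
    for line in f:
        # do some work here, e.g. compare prevLine with line
        prevLine = line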

This will store the previous line in prevLine while you are looping.

Edit: apparently the OP needs to read this file backwards.

And after about an hour of research, I failed multiple times to do it within the memory constraints.

Here you go, Lim. That guy knows what he's doing; here is his best idea:

General approach #2: Read the entire file, store position of lines

With this approach, you also read through the entire file once, but instead of storing the entire file (all the text) in memory, you only store the binary positions inside the file where each line started. You can store these positions in a similar data structure as the one storing the lines in the first approach.

Whenever you want to read line X, you have to re-read the line from the file, starting at the position you stored for the start of that line.

Pros: Almost as easy to implement as the first approach.

Cons: Can take a while to read large files.
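
A minimal Python 3 sketch of the approach quoted above, not the quoted author's own code. The function names index_line_offsets and read_lines_backwards are hypothetical, and the file is opened in binary mode so that the stored positions are plain byte offsets. Only one integer per line is kept in memory, never the line text.

```python
from array import array


def index_line_offsets(path):
    """Scan the file once and return the byte offset where each line starts."""
    offsets = array("q")  # compact array of signed 64-bit integers
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)  # in binary mode, len(line) is the byte count
    return offsets


def read_lines_backwards(path):
    """Yield the lines of the file from last to first, as bytes."""
    offsets = index_line_offsets(path)
    with open(path, "rb") as f:
        for start in reversed(offsets):
            f.seek(start)
            yield f.readline()
```

Because read_lines_backwards is a generator, it can be consumed with the same "remember the previous line" loop shown earlier in this answer, so each backward step only holds two lines of text at a time. The trade-off is the one named under Cons: the index pass still has to read the whole file once.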

answered 2013-06-28T20:46:02.497
5

@Lim, this is how I would write it (in reply to your comment):

def do_stuff_with_two_lines(previous_line, current_line):
    print("--------------")
    print(previous_line)
    print(current_line)

# The with block closes the file even if an error occurs.
with open('my_file.txt', 'r') as my_file:
    # Prime the loop with the first line; readline() returns '' for an empty file.
    current_line = my_file.readline()

    for line in my_file:
        previous_line = current_line
        current_line = line

        do_stuff_with_two_lines(previous_line, current_line)

answered 2013-06-28T20:56:19.937
2

I would write a simple generator for this task:

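def pairwise(fname):
    with open(fname) as fin:
        # A bare next(fin) raises StopIteration on an empty file, which
        # becomes a RuntimeError inside a generator on Python 3.7+.
        prev = next(fin, None)
        if prev is None:
            return
        for line in fin:
            yield prev, line
            prev = line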

Alternatively, you can use the pairwise recipe from the itertools documentation:

import itertools

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    # Python 3: zip is lazy. On Python 2, use itertools.izip instead.
    return zip(a, b)

answered 2013-06-28T20:29:49.900
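
As a hedged usage sketch in Python 3: the recipe accepts any iterable, and an open file object yields its lines lazily, so only two lines are held at a time. Here io.StringIO stands in for a real file returned by open(), and the sample data is made up for illustration.

```python
import io
import itertools


def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)


# Hypothetical data; a real call would pass the object returned by open().
fake_file = io.StringIO("10\n12\n12\n9\n")

# Collect every place where a line differs from the one before it.
changes = [
    (prev.strip(), cur.strip())
    for prev, cur in pairwise(fake_file)
    if prev != cur
]
```

On Python 3.10 and later, itertools.pairwise provides the same pairing directly.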