
I'm trying to insert text at very specific locations in a text file. The file can be fairly large (well over 10 GB).

The approach I am currently using to read it:

with open("my_text_file.txt") as f:
    while True:
        result = f.read(set_number_of_bytes)
        if not result:
            break  # end of file reached
        x = process_result(result)
        if x:
            # replace_some_characters_that_i_just_read_and write_it_back_to_same_file
            pass

However, I am unsure as to how to implement

replace_some_characters_that_i_just_read_and write_it_back_to_same_file

Is there a method I can use to determine how far I have read in the current file, so that I can use that position to write to the file?

Performance-wise, if I were to use the approach above to write to the original file at specific locations, would there be efficiency issues with having to find the write location before writing?

Or would you recommend creating an entirely different file and appending to it on each iteration of the loop above, then deleting the original file once the operation is complete? Assume space is not a large concern but performance is.


1 Answer


Use the fileinput module, which handles files correctly when replacing data, with the inplace flag set:

import sys
import fileinput

for line in fileinput.input('my_text_file.txt', inplace=True):
    x = process_result(line)
    if x:
        line = line.replace('something', x)

    sys.stdout.write(line)

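When you use the inplace flag, the original file is moved to a backup, and anything you write to sys.stdout is written to the original filename (so, as a new file). Make sure you include all lines, whether altered or not.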
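If you would rather manage the new file yourself, as the question suggests, a related approach is to stream into a temporary file and swap it in at the end. This is only a minimal sketch, not what fileinput does internally: `rewrite_file` and `transform` are hypothetical names, and it assumes the temporary file can live in the same directory as the original so that `os.replace` is a rename on the same filesystem.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Stream `path` line by line through `transform` into a temporary
    file, then replace the original with the result.

    `transform` is a hypothetical callback: it takes one line (str)
    and returns the text to write in its place.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", newline="") as dst, \
                open(path, newline="") as src:
            for line in src:
                dst.write(transform(line))
        # Swap the finished file in under the original name.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temporary file; the original is left as it was.
        os.unlink(tmp_path)
        raise
```

For example, `rewrite_file('my_text_file.txt', lambda line: line.replace('something', 'other'))` copies every line, changed or not, so only one line is held in memory at a time.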

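You have to rewrite the complete file whenever your replacement data is not exactly the same number of bytes as the parts that you are replacing.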
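In the one case where the replacement is exactly the same number of bytes, you can overwrite in place, using `f.tell()` to record where each read started and `f.seek()` to go back there. The following is a rough sketch under that assumption; `overwrite_same_length` is a hypothetical helper, and it works on bytes in binary mode so that offsets are real byte positions.

```python
def overwrite_same_length(path, old, new):
    """Overwrite occurrences of `old` with `new` directly in the file.

    `old` and `new` are bytes objects of equal length, so nothing after
    a match has to be shifted. Returns the number of lines changed.
    """
    if len(old) != len(new):
        raise ValueError("in-place overwrite needs equal-length byte strings")

    changed = 0
    with open(path, "r+b") as f:  # read and write, binary, no truncation
        while True:
            start = f.tell()      # byte offset of the line about to be read
            line = f.readline()
            if not line:
                break             # end of file
            if old in line:
                f.seek(start)     # go back to the start of this line
                f.write(line.replace(old, new))
                # Explicitly reposition at the start of the next line.
                f.seek(start + len(line))
                changed += 1
    return changed
```

Seeking is cheap compared with the reads and writes themselves, so finding the write location is not the bottleneck here. The restriction is the length: as soon as `new` is longer or shorter than `old`, everything after the match would have to move, which is why the whole file has to be rewritten.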

answered 2013-05-26T20:35:57.210