python - Fastest way to read a large text file (several GB) in Python

Question

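I have a large text file (~7 GB). I am looking for the fastest way to read large text files. I have been reading about several approaches that read the file chunk by chunk to speed up the process.

For example, effbot suggests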

# File: readline-example-3.py

file = open("sample.txt")

while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass  # do something

in order to process 96,900 lines of text per second. Other authors suggest using islice()

from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines

list(islice(f, n)) returns a list of the next n lines of the file f. Using it inside a loop gives you the file in chunks of n lines.
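To make the islice() idea concrete, here is a minimal self-contained sketch. The helper name read_in_chunks, the temporary demo file, and the chunk size of 4 are assumptions for illustration only; they are not part of the question.

```python
import os
import tempfile
from itertools import islice


def read_in_chunks(path, n):
    """Yield lists of up to n lines from the text file at path (hypothetical helper)."""
    with open(path) as f:
        while True:
            # islice pulls at most n lines from the file iterator
            chunk = list(islice(f, n))
            if not chunk:
                break  # an empty list means end of file
            yield chunk


# Build a small 10-line demo file so the sketch can run anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {i}\n" for i in range(10)))
    demo_path = tmp.name

chunk_sizes = [len(chunk) for chunk in read_in_chunks(demo_path, 4)]
os.remove(demo_path)
print(chunk_sizes)  # [4, 4, 2]
```

Only one chunk is held in memory at a time, so peak memory depends on n and the line length rather than on the total file size. Whether this is actually faster than plain line iteration would need to be measured on the real 7 GB file.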

score 16 · Accepted Answer

with open(<FILE>) as FileObj:
    for line in FileObj:
        print(line)  # or do some other thing with the line...

This reads one line at a time into memory and closes the file when it is done...
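As a rough illustration of the same pattern doing real work, the sketch below counts matching lines while iterating over the file object directly. The function name count_matching_lines, the "ERROR" marker, and the generated demo file are assumptions, not something from the answer.

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Count lines containing needle, reading one line at a time (hypothetical helper)."""
    count = 0
    with open(path) as f:  # the with block closes the file when the loop ends
        for line in f:  # the file object yields lines lazily
            if needle in line:
                count += 1
    return count


# Build a 100-line demo file in which every 10th line is marked ERROR.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(100):
        tmp.write(("ERROR " if i % 10 == 0 else "ok ") + str(i) + "\n")
    demo_path = tmp.name

error_count = count_matching_lines(demo_path, "ERROR")
os.remove(demo_path)
print(error_count)  # 10
```

Because only the current line is kept, memory use stays small regardless of file size.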
