php - Parsing a CSV with ftell in PHP

Question

I have a 500 MB CSV file with more than 500,000 rows, each with 80 fields. I am processing the file line by line with fgetcsv().

$col1 = array();
while (($row = fgetcsv($handle, 1000, ",")) !== FALSE) {
  $col1[] = $row[0];
}

Because of my hosting provider's limit on PHP execution time (120 seconds), I cannot process the whole file in one go.

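I tried using ftell() and fseek() to remember the last position so I could restart from it. The problem is that ftell() sometimes returns a position in the middle of a line, so resuming means the first half of that line is missed.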
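A minimal sketch of the ftell()/fseek() checkpointing described above, not the code that was actually run. The function name processCsvSlice, the state file, and the time budget are hypothetical. It passes 0 (no limit) as the length argument of fgetcsv(): the PHP manual states that a non-zero length must be greater than the longest line, otherwise the line is split into chunks, which is one possible way for ftell() to land mid-line when the length is 1000.

```php
<?php
// Sketch only: the function name, state file, and time budget are hypothetical.

/**
 * Process CSV rows from $csvPath, starting at the byte offset stored in
 * $statePath, and stop once $budgetSeconds have elapsed.
 * Returns true when the end of the file has been reached.
 */
function processCsvSlice(string $csvPath, string $statePath, float $budgetSeconds, callable $onRow): bool
{
    $start  = microtime(true);
    $offset = is_file($statePath) ? (int) file_get_contents($statePath) : 0;

    $handle = fopen($csvPath, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvPath");
    }
    fseek($handle, $offset);

    // Length 0 means no line-length limit, so each call consumes one whole record.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $onRow($row);
        $offset = ftell($handle); // just past the record that was processed

        // Stop voluntarily, well before the host's hard limit kills the script.
        if (microtime(true) - $start >= $budgetSeconds) {
            break;
        }
    }

    $done = feof($handle);
    fclose($handle);
    file_put_contents($statePath, (string) $offset); // checkpoint for the next run
    return $done;
}
```

The budget should be set comfortably below the 120-second limit, because the offset is only saved when the function returns normally; if the script is killed mid-loop, the next run restarts from the previous checkpoint and repeats some rows.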

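Is there an elegant way to know which line was last processed successfully and resume from the line after it? I realize I could keep a simple counter and loop up to that point again, but that gives diminishing returns on the number of lines I can process toward the end of the file.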

Is there something like ftell() and fseek() that works for my situation? Or a way to make ftell() return a pointer to the end of the previous line?

score 3 · Accepted Answer

When I need to work with files that big, I always use the 'divide and conquer' approach. In your case I would:

Dynamically create a folder

Copy the big file into it

Split it (on Linux, with the split command, called from PHP using the shell_exec function)

After splitting it, delete the big file (the copy in the folder)

Then loop through the files in the folder reading one by one.

Each time I finish a file, I delete it. So if the time limit is hit, you only need to continue reading the files left in the folder.
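The steps above could be sketched roughly as follows. This is only an illustration: the function names, the chunk_ prefix, the source.csv copy name, and the chunk size of 10,000 lines are assumptions, and it presumes a host where shell_exec is enabled and the POSIX split utility is available. Splitting by line count also assumes no quoted field contains an embedded newline.

```php
<?php
// Sketch only: function names, paths, and the chunk size are hypothetical.

/** Copy $bigFile into $workDir, split the copy into line-based chunks, then delete the copy. */
function splitIntoChunks(string $bigFile, string $workDir, int $linesPerChunk = 10000): void
{
    if (!is_dir($workDir) && !mkdir($workDir, 0700, true)) {
        throw new RuntimeException("Cannot create $workDir");
    }
    $copy = $workDir . '/source.csv';
    if (!copy($bigFile, $copy)) {
        throw new RuntimeException("Cannot copy $bigFile");
    }
    // POSIX split: -l sets lines per output file; the last argument is the output name prefix.
    shell_exec(sprintf(
        'split -l %d %s %s',
        $linesPerChunk,
        escapeshellarg($copy),
        escapeshellarg($workDir . '/chunk_')
    ));
    unlink($copy);
}

/**
 * Read the remaining chunks one by one, deleting each when it is finished.
 * Returns true when no chunks are left, false when the time budget ran out.
 */
function processChunks(string $workDir, float $budgetSeconds, callable $onRow): bool
{
    $start  = microtime(true);
    $chunks = glob($workDir . '/chunk_*') ?: []; // glob() sorts names alphabetically by default

    foreach ($chunks as $chunk) {
        if (microtime(true) - $start >= $budgetSeconds) {
            return false; // leave the remaining chunks for the next run
        }
        $handle = fopen($chunk, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $chunk");
        }
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            $onRow($row);
        }
        fclose($handle);
        unlink($chunk); // a finished chunk is never read again
    }
    return true;
}
```

The chunk size should be small enough that one chunk is processed well within the time limit. If a run is killed in the middle of a chunk, that chunk is read again from its start on the next run, so the per-row work needs to tolerate repeats.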
