php - PHP: Retrieving lines from the end of a large text file

Question

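I have been searching for an answer for quite a while, but haven't found anything that works properly.

I have some log files, some of them around 100MB in size and up to 140,000 lines of text. Using PHP, I am trying to get the last 500 lines of a file.

How would I get those 500 lines? Most functions read the whole file into memory, which is not reasonable here. I would prefer to stay away from executing system commands.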

score 6 · Accepted Answer

If you are on a *nix machine, you should be able to use shell escaping and the tool "tail". It's been a while, but something like this:

$lastLines = `tail -n 500 /path/to/logfile`; // /path/to/logfile is a placeholder for your log file

Note the use of backticks, which execute the string in BASH or a similar shell and return the result.
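If the path comes from a variable, the backtick form is easy to get wrong, so here is a minimal sketch of the same idea using shell_exec() and escapeshellarg(). The function name tail_via_shell and its parameters are made up for illustration, and it assumes a tail binary is available on the PATH.

```php
<?php
// Hypothetical helper (not from the answer): fetch the last $n lines using the external `tail` tool.
function tail_via_shell(string $path, int $n = 500): array
{
    // escapeshellarg() quotes the path so spaces and shell metacharacters are passed literally.
    $cmd = 'tail -n ' . max(0, $n) . ' ' . escapeshellarg($path);

    // shell_exec() returns null or false when the command fails or prints nothing.
    $output = shell_exec($cmd);
    if (!is_string($output) || $output === '') {
        return [];
    }

    // Strip the final newline so explode() does not produce an empty last element.
    return explode("\n", rtrim($output, "\n"));
}
```

Unlike file(), this returns the lines without their trailing newlines.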

score 4 · Accepted Answer

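I wrote this function, and it seems to work quite well for me. It returns an array of lines, just like file. If you want it to return a string like file_get_contents, just change the return statement to return implode('', array_reverse($lines));: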

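function file_get_tail($filename, $num_lines = 10){

    $file = fopen($filename, "r");
    if ($file === false)
        return array();

    // Walk backwards from the last byte, collecting each line's bytes in reverse order.
    fseek($file, 0, SEEK_END);
    $lines = array();
    $chars = array();
    for ($pos = ftell($file) - 1; $pos >= 0 && count($lines) < $num_lines; $pos--) {
        fseek($file, $pos, SEEK_SET);
        $char = fgetc($file);
        // A newline seen after other bytes terminates the line before it,
        // so everything collected so far is one complete line.
        if ($char === "\n" && $chars) {
            $lines[] = implode('', array_reverse($chars));
            $chars = array();
        }
        $chars[] = $char;
    }
    fclose($file);

    // Start of file reached before enough lines were found: keep the first line too.
    if ($chars && count($lines) < $num_lines)
        $lines[] = implode('', array_reverse($chars));

    return array_reverse($lines);
}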

Example:

file_get_tail('filename.txt', 500);

score 3 · Accepted Answer

If you want to do this in PHP:

<?php
/**
  Read last N lines from file.

  @param $filename string  path to file. must support seeking
  @param $n        int     number of lines to get.

  @return array            up to $n lines of text
*/
function tail($filename, $n)
{
  $buffer_size = 1024;

  $fp = fopen($filename, 'r');
  if (!$fp) return array();

  fseek($fp, 0, SEEK_END);
  $pos = ftell($fp);

  $input = '';
  $line_count = 0;

  while ($pos > 0 && $line_count < $n + 1)
  {
    // read the previous block of input
    $read_size = $pos >= $buffer_size ? $buffer_size : $pos;
    fseek($fp, $pos - $read_size, SEEK_SET);

    // prepend the current block, and count the new lines
    $input = fread($fp, $read_size).$input;
    $line_count = substr_count(ltrim($input), "\n");

    // if $pos is == 0 we are at start of file
    $pos -= $read_size;
    if (!$pos) break;
  }

  fclose($fp);

  // return the last $n lines found

  return array_slice(explode("\n", rtrim($input)), -$n);
}

var_dump(tail('/var/log/syslog', 50));

This is largely untested, but it should be enough for you to get to a fully working solution.

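The buffer size is 1024, but it can be changed to be larger or smaller. (You could even set it dynamically based on $n * an estimated line length.) This should be better than seeking character by character, although it does mean we need substr_count() to look for newlines.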
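The idea of sizing the buffer from $n times an estimated line length could look roughly like the sketch below. The name tail_sized, the $avg_line_len parameter, its default of 120 bytes, and the 1 KiB to 1 MiB clamp are all assumptions made for illustration, not part of the answer above.

```php
<?php
// Sketch only: tail_sized() and $avg_line_len are illustrative names, not from the answer.
function tail_sized(string $filename, int $n, int $avg_line_len = 120): array
{
    if ($n < 1) {
        return [];
    }
    $fp = fopen($filename, 'rb');
    if (!$fp) {
        return [];
    }

    // Size each block so one read usually covers all $n lines,
    // clamped to 1 KiB .. 1 MiB so memory use stays bounded.
    $block = max(1024, min($n * $avg_line_len, 1 << 20));

    fseek($fp, 0, SEEK_END);
    $pos = ftell($fp);
    $input = '';

    // Read blocks backwards until more than $n newlines are buffered
    // or the start of the file is reached.
    while ($pos > 0 && substr_count($input, "\n") <= $n) {
        $read = min($block, $pos);
        $pos -= $read;
        fseek($fp, $pos, SEEK_SET);
        $input = fread($fp, $read) . $input;
    }
    fclose($fp);

    if ($input === '') {
        return [];
    }
    // Drop one trailing newline so explode() does not yield an empty last element.
    if (substr($input, -1) === "\n") {
        $input = substr($input, 0, -1);
    }
    return array_slice(explode("\n", $input), -$n);
}
```

If the estimate is too low the loop simply reads more blocks, so a poor guess costs extra reads rather than wrong output.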
