I have a large text file (144,000 lines) with a custom format like this:

xxx
XXXfield1XXX
value1
xxx
xxx
XXXfield2XXX
value2
xxx
xxx
XXXfield3XXX
value3
xxx

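However, the file contains at least one syntax error (possibly more), because the total number of lines is not divisible by four.

How can I find the line numbers of the errors using only RegExp?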


2 Answers


Detecting the error is easy. Imagine a file like this:

log.txt

xxx
XXXfield1XXX
value1
xxx
xxx
XXXfield2XXX <----- Note that this field has no value 
xxx
xxx
XXXfield3XXX
value3
xxx
value3
xxx

A simple scanner

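$fileSource = "log.txt";
$tagRow = "xxx";   // delimiter line that opens and closes each record
$tagField = "XXX"; // prefix of a field-name line

$rh = fopen($fileSource, 'rb');
if (!$rh) {
    exit("Can't open $fileSource\n");
}
echo "<pre>";
$i = 0; // zero-based line index
// Loop on fgets() itself so a failed read at end of file is never scanned
while (($l = fgets($rh)) !== false) {
    $l = trim($l);
    // Positions 0 and 3 of each 4-line record must be the row tag
    if ((($i % 4) == 0 || ($i % 4) == 3) && $l != $tagRow) {
        echo "Row tag error line $i \n";
        break;
    }

    // Position 1 must start with the field tag
    if (($i % 4) == 1 && strpos($l, $tagField) !== 0) {
        echo "Missing Field tag line $i  \n";
        break;
    }

    // Position 2 must be a value, so it may not start with either tag
    if (($i % 4) == 2 && (strpos($l, $tagRow) === 0 || strpos($l, $tagField) === 0)) {
        echo "Fixed Missing Value line $i \n";
        break;
    }
    $i ++;
}
fclose($rh);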

Output

  Fixed Missing Value line 6 
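
Since the question says there may be more than one error, here is a rough sketch of how the same scan could keep going after the first mismatch. The function name `findAllFormatErrors` and the resync rule (skip ahead to the next `xxx` line that is directly followed by an `XXX...` line) are my own assumptions, not part of the scanner above. Indexes are zero-based, as in the scanner.

```php
<?php
// Hypothetical helper: returns the zero-based indexes of every line where
// the expected 4-line record layout breaks. The resync rule is an assumption.
function findAllFormatErrors(array $lines, string $tagRow = 'xxx', string $tagField = 'XXX'): array
{
    $lines = array_map('trim', array_values($lines));
    $count = count($lines);
    $errors = [];
    $pos = 0; // expected position inside the current record (0..3)
    $i = 0;
    while ($i < $count) {
        $l = $lines[$i];
        $isField = strpos($l, $tagField) === 0;
        if ($pos === 0 || $pos === 3) {
            $ok = ($l === $tagRow);               // opening or closing row tag
        } elseif ($pos === 1) {
            $ok = $isField;                       // field-name line
        } else {
            $ok = ($l !== $tagRow && !$isField);  // value line
        }
        if ($ok) {
            $pos = ($pos + 1) % 4;
            $i++;
            continue;
        }
        $errors[] = $i;
        // Resync: skip ahead to the next row tag directly followed by a field line.
        while ($i < $count
            && !($lines[$i] === $tagRow
                && $i + 1 < $count
                && strpos($lines[$i + 1], $tagField) === 0)) {
            $i++;
        }
        $pos = 0;
    }
    return $errors;
}

// The log.txt sample from above, as an array of lines.
$sample = [
    'xxx', 'XXXfield1XXX', 'value1', 'xxx',
    'xxx', 'XXXfield2XXX', 'xxx',
    'xxx', 'XXXfield3XXX', 'value3', 'xxx',
    'value3', 'xxx',
];
echo implode(', ', findAllFormatErrors($sample)), "\n";
```

For the real file you could pass `file('log.txt')`. Note that this sketch does not report a record that is cut off at the very end of the file, and a different resync rule may suit your data better.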
answered 2012-11-04T16:25:01.137


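Write a program that reads the file one line at a time and parses it. If a line does not match the format, report the error and exit.

As you read each line, keep track of the line number. Use the % operator and a switch statement to test each line based on its line number.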

// $linenum is the current line number and $line its text;
// each parenthesized condition is a placeholder to fill in.
switch ($linenum % 4) {
    case 0:
        $error = (some condition that evaluates the line);
        break;
    case 1:
        $error = (some condition that evaluates the line);
        break;
    case 2:
        $error = (some condition that evaluates the line);
        break;
    case 3:
        $error = (some condition that evaluates the line);
        break;
}
if ($error) {
    echo 'Error on line ' . $linenum . ': ' . $line;
    exit;
}
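
To make the skeleton concrete, here is a minimal sketch with the placeholders filled in using `preg_match`, which also gets a bit closer to the RegExp the question asks about. It assumes zero-based line numbers (so `% 4 == 0` is the opening `xxx`), and the function name `findFirstFormatError` and the exact patterns are my guesses based on the sample format in the question.

```php
<?php
// Hypothetical helper: returns the zero-based index of the first malformed
// line, or null if every line fits the 4-line record layout.
function findFirstFormatError(array $lines): ?int
{
    foreach (array_values($lines) as $linenum => $line) {
        $line = rtrim($line, "\r\n");
        switch ($linenum % 4) {
            case 0: // opening row tag
            case 3: // closing row tag
                $error = ($line !== 'xxx');
                break;
            case 1: // field name, e.g. XXXfield1XXX
                $error = !preg_match('/^XXX.+XXX$/', $line);
                break;
            default: // case 2: value line, must not look like a tag
                $error = (bool) preg_match('/^(xxx$|XXX)/', $line);
                break;
        }
        if ($error) {
            return $linenum;
        }
    }
    return null;
}

// Two records; the second one is missing its value line.
$sample = ['xxx', 'XXXfield1XXX', 'value1', 'xxx',
           'xxx', 'XXXfield2XXX', 'xxx', 'xxx'];
$firstError = findFirstFormatError($sample);
echo $firstError === null ? "No errors\n" : "Error on line $firstError\n";
```

For the real file, something like `findFirstFormatError(file('yourfile.txt'))` should work, where the file name is a placeholder. Add 1 to the result if you want the line number as shown in a text editor.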
answered 2012-11-02T17:07:43.700