php - PHP regex match between two patterns

Question

I am trying to parse a log file that contains a large number of traces, some of which span multiple lines.

Example:

[trace-123] <request>This is a log line</request>
[trace-124] <reply>This is another log line

this is part of "[trace-124]" still.</reply>
[trace-125] <request>final log line.</request>

I am trying to use preg_match_all to get an array of all the traces.

$file = file_get_contents("traces.txt");
$tracePattern = "/(\[trace-[0-9]*+\]+[\s\S]*)(?<=\<\/reply>|\<\/request>)/";

preg_match_all($tracePattern,$file,$lines);

echo "<pre>";print_r($lines);echo "</pre>";

Ideally, I would like my result to look like this:

Array
(
    [0] => [trace-123] <request>This is a log line</request>
    [1] => [trace-124] <reply>This is another log line

this is part of "[trace-124]" still.</reply>
    [2] => [trace-125] <request>final log line.</request>
)

But when I run it, I get an array in which everything ends up in a single element. When I wrote the expression, my goal was basically to look for:

\[trace-[0-9]*\]

and capture everything from that match up to the next match.

I found that

\[trace-[0-9]*+\].*

works well, but breaks down when there is a newline.

score 3 · Accepted Answer

The following may be a better approach here.

$results = preg_split('/\R(?=\[trace[^\]]*\])/', $text);
print_r($results);

See the working demo.

Output

Array
(
    [0] => [trace-123] <request>This is a log line</request>
    [1] => [trace-124] <reply>This is another log line

this is part of "[trace-124]" still.</reply>
    [2] => [trace-125] <request>final log line.</request>
)
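For completeness, here is a minimal self-contained sketch of the split above. The `$text` value is an assumption: it is simply the sample log from the question built inline, since the snippet does not show where `$text` comes from.

```php
<?php
// Sample log from the question, built inline (assumed input).
$text = implode("\n", [
    '[trace-123] <request>This is a log line</request>',
    '[trace-124] <reply>This is another log line',
    '',
    'this is part of "[trace-124]" still.</reply>',
    '[trace-125] <request>final log line.</request>',
]);

// Split at a line break (\R covers \n, \r\n and \r) only when it is
// immediately followed by a "[trace...]" marker. The lookahead does not
// consume the marker, so it stays at the start of the next chunk.
$results = preg_split('/\R(?=\[trace[^\]]*\])/', $text);

print_r($results);
```

The quoted "[trace-124]" in the middle of the second record does not cause a split, because it is preceded by a double quote rather than a line break.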

score 2 · Accepted Answer

Use this:

$file = '[trace-123] <request>This is a log line</request>
[trace-124] <reply>This is another log line

this is part of "[trace-124]" still.</reply>
[trace-125] <request>final log line.</request>';

$tracePattern = "/\[trace-[0-9]*+\]+\s*<(?:reply|request)>.*?<\/(?:reply|request)>/s";

preg_match_all($tracePattern,$file,$lines);

$lines = $lines[0]; // by default, $lines[0] is the array of full matches, so keep just that

echo "<pre>";print_r($lines);echo "</pre>";

Working demo: http://ideone.com/n8n5r3

score 2 · Accepted Answer

I would recommend a solution using preg_split:

preg_split('/\R+(?=\[trace-\d+])/', $str)

This produces the following result:

Array
(
    [0] => [trace-123] <request>This is a log line</request>
    [1] => [trace-124] <reply>This is another log line

this is part of "[trace-124]" still.</reply>
    [2] => [trace-125] <request>final log line.</request>
)
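One practical effect of the `\R+` in this pattern is that runs of line breaks between records are consumed as a single separator. The sketch below illustrates that on an assumed input (not from the question) that uses Windows line endings and a blank line between two records.

```php
<?php
// Assumed input: two records separated by a blank line, with \r\n endings.
$str = "[trace-1] <request>a</request>\r\n\r\n[trace-2] <reply>b\r\nc</reply>";

// \R+ swallows every consecutive line break in front of the next marker,
// so the first record is returned without trailing line breaks.
// The line break inside the second record is followed by "c", not by a
// marker, so the lookahead fails there and the record stays in one piece.
$records = preg_split('/\R+(?=\[trace-\d+])/', $str);

print_r($records);
```

With a single `\R` instead of `\R+`, the first record in this input would keep a trailing `\r\n`, which would then need trimming.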

score 2 · Accepted Answer

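This works in MULTI_LINE mode (the m modifier in PHP). It trims leading whitespace and trailing newlines.

Edit: this assumes that the [trace- ] anchor sits at the beginning of a line,
or at the beginning of a line plus non-newline whitespace up to "trace". That is
the only discernible record delimiter.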

 #  ^[^\S\n]*(\[trace-[^]]*\][^\n]*(?:(?!\s+\[trace-[^]]*\])\n[^\n]*)*)

 ^ [^\S\n]* 
 (
      \[trace- [^]]* \] [^\n]* 

      (?:
           (?! \s+ \[trace- [^]]* \] )
           \n [^\n]* 
      )*
 )

Output (in single quotes)

 '[trace-123] <request>This is a log line</request>'
 '[trace-124] <reply>This is another log line

 this is part of "[trace-124]" still.</reply>'
 '[trace-125] <request>final log line.</request>'
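The answer gives only the raw pattern, so here is a hedged sketch of how it might be run from PHP. The delimiters, the `m` modifier, and the variable names `$log` and `$matches` are assumptions added for illustration; the pattern itself is the compact form shown above.

```php
<?php
// Sample log from the question, built inline (assumed input).
$log = implode("\n", [
    '[trace-123] <request>This is a log line</request>',
    '[trace-124] <reply>This is another log line',
    '',
    'this is part of "[trace-124]" still.</reply>',
    '[trace-125] <request>final log line.</request>',
]);

// m modifier: ^ matches at the start of every line.
// Group 1 starts at the marker and keeps taking "\n + rest of line"
// as long as the upcoming whitespace is not followed by a new marker.
$pattern = '/^[^\S\n]*(\[trace-[^]]*\][^\n]*(?:(?!\s+\[trace-[^]]*\])\n[^\n]*)*)/m';

preg_match_all($pattern, $log, $matches);

// $matches[1] holds the records without any leading indentation.
print_r($matches[1]);
```

Because the continuation loop only consumes `\n`, input with `\r\n` line endings would need to be normalised first or the pattern adjusted.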

score 0 · Accepted Answer

The . symbol matches any character except the newline \n. You can try replacing it with (.|\s), like this:

#\[trace-[0-9]*+\](.|\s)*#

Note: you can use non-capturing parentheses (?: )

Simpler still, add the "s" flag:

#\[trace-[0-9]*+\].*#s
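A small sketch of the difference the s flag makes. The two-line `$subject` is an assumed input chosen only to make the effect visible.

```php
<?php
// Assumed two-line input for illustration.
$subject = "[trace-1] first line\nsecond line";

// Without the s flag, . stops at the newline.
preg_match('#\[trace-[0-9]*+\].*#', $subject, $withoutS);

// With the s flag (PCRE_DOTALL), . also matches newlines.
preg_match('#\[trace-[0-9]*+\].*#s', $subject, $withS);

var_dump($withoutS[0]); // only the first line
var_dump($withS[0]);    // both lines
```

On a log with several records, the greedy `.*` combined with the s flag runs to the end of the input, so the flag is usually paired with a lazy quantifier and an explicit end marker.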

score 0 · Accepted Answer

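You should use reluctant (lazy) quantifiers (??, +? or *?).

I believe this regex, /(\[trace-[0-9]*\]\s*(?s:.*?)<\/(?:reply|request)>)/, should do it... the (?s:.*?) part is the secret: the inline s modifier lets . match newlines, and the lazy *? stops at the first closing tag. :)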
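To make the effect of a reluctant quantifier concrete, the sketch below counts matches for a greedy and a lazy variant on the sample log from the question. Note that in PHP's PCRE the modifier that lets `.` cross newlines is `s`; the `m` modifier only changes where `^` and `$` match. The variable names are assumptions for illustration.

```php
<?php
// Sample log from the question, built inline (assumed input).
$log = implode("\n", [
    '[trace-123] <request>This is a log line</request>',
    '[trace-124] <reply>This is another log line',
    '',
    'this is part of "[trace-124]" still.</reply>',
    '[trace-125] <request>final log line.</request>',
]);

// Greedy: .* runs to the end and backtracks to the LAST closing tag,
// so the whole log comes back as one match.
preg_match_all('/\[trace-[0-9]*\].*<\/(?:reply|request)>/s', $log, $greedy);

// Lazy: .*? stops at the FIRST closing tag after each marker.
// (?s:...) turns on dot-matches-newline for that group only.
preg_match_all('/\[trace-[0-9]*\]\s*(?s:.*?)<\/(?:reply|request)>/', $log, $lazy);

echo count($greedy[0]), "\n"; // 1
echo count($lazy[0]), "\n";   // 3
```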

score 0 · Accepted Answer


This should work with the s flag:

(\[trace-[0-9]+\].*?<\/(?:reply|request)>)

Live demo

Answered 2013-11-14T20:33:24.910
