php - PHP - Complex regular expression extraction

Question

I have some strings to parse, and it has become a bit complicated.

<?php
$notecomments = '
This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let\'s put some more in here<br />with a new line.';

if(preg_match_all('/\(<b>(?:(?!\(<b>).)*/s', $notecomments, $matches)){
print_r($matches);
}

/* result of code:
Array
(
    [0] => Array
        (
            [0] => (<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>
            [1] => (<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.
        )

)
*/
?>

I can loop over the "appended" notes, because the preg_match_all regex rule gives me a marker to work with.

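However, many of my notes begin with text that my preg_match_all does not capture. (In this case: "This is the first of the notes, and so whatever comes later is appended.")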

My first goal is accomplished, and the result of the code above shows it: I am extracting the notes that were appended to the first note.

My next goal is to detect anything that comes before the first iteration, that is, before the first match of the regex above. This is where I am stuck.

score 0 · Accepted Answer

For this I use preg_replace_callback with two regular expressions:

$notecomments = "This is the first of the notes, and so whatever comes later is appended.<br>(<b>John Smith</b>) at <b class=\"datetimeGMT\">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class=\"datetimeGMT\">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.";
$output = preg_replace_callback(
    array("~<b (.*?)>(.+?)</b>~si", "~<b>(.+?)</b>~si"),
    function ($matches) {
        // The first pattern has two groups (attributes, content); the second has only one.
        if (isset($matches[2])) {
            print_r($matches[2] . "\n");
        } else {
            print_r($matches[1] . "\n");
        }
        return '';
    },
    ' ' . $notecomments . ' '
);

Output:

 2012-02-07 00:00:20 GMT
 2013-02-07 00:08:06 GMT
 John Smith
 Alex Boom
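
The output above lists both dates before both names because preg_replace_callback runs each pattern in the array over the whole subject in turn. It also does not return the text that precedes the first "(<b>" marker, which is what the question was stuck on. The following is only a sketch of one possible approach, not part of the accepted answer: it assumes every appended note has the exact shape "(<b>author</b>) at <b class="datetimeGMT">date</b><hr>text", and the sample string and the names $firstNote, $notes, author, date and text are made up for illustration.

```php
<?php
// Shortened, hypothetical sample in the same shape as the question's string.
$notecomments = 'First note.<br>'
    . '(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>A comment<br><br>'
    . '(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>More text';

// Lazily capture everything from the start of the string up to the first
// "(<b>" marker, or up to the end of the string if there is no marker at all.
$firstNote = '';
if (preg_match('~\A(.*?)(?=\(<b>|\z)~s', $notecomments, $m)) {
    $firstNote = $m[1];
}

// One pattern per appended note, so author, date and text stay together.
// The text group reuses the question's "anything that does not start a new
// (<b> marker" construct.
$pattern = '~\(<b>(?<author>.+?)</b>\) at <b class="datetimeGMT">(?<date>.+?)</b><hr>'
         . '(?<text>(?:(?!\(<b>).)*)~s';
preg_match_all($pattern, $notecomments, $notes, PREG_SET_ORDER);

echo "First note: ", $firstNote, "\n";
foreach ($notes as $note) {
    echo $note['author'], " | ", $note['date'], " | ", $note['text'], "\n";
}
```

With PREG_SET_ORDER each element of $notes is one complete note, so the loop can read its author, date and text together instead of receiving all the dates first and all the names afterwards.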
