php - Regex: problem with selecting multiple groups

Question

I need to create 3 groups from the following text:

[startA]
this is the first group
 [startB]
 blabla
[end]
[end]
[startA]
this is the second group
 [startB]
 blabla
[end]
[end]
[startA]
this is the third group
 [startB]
 blabla
[end]
[end]

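As you can see, every group starts with [startA] and ends with [end], so writing a regex that matches this should be easy.
The problem is that the string [end] can appear any number of times inside a group. The regex should match a group that starts with [startA] and ends with the right [end], the one just before the next [startA], not an earlier [end].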

I think this should be done with a lookahead, but none of my attempts so far have worked.
Is it possible to do this with a regex?

score 1 · Accepted Answer

You should use a recursive regex pattern:

preg_match_all('/\[(?!end)[^[\]]+\](?:[^[\]]*|[^[\]]*(?R)[^[\]]*)\[end\]\s*/', $s, $m);

See this demo.
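
As a rough sketch of how the pattern above behaves, the snippet below runs it against a shortened copy of the question's text. The variable `$s` comes from the answer; the heredoc contents, `$pattern`, `$count`, and the printing loop are assumptions added for illustration. `(?R)` re-enters the whole pattern, so a nested `[startB] ... [end]` pair is consumed as a unit before the outer `[end]` is matched.

```php
<?php
// Sample input modelled on the question (assumed, shortened to two groups).
$s = <<<TEXT
[startA]
this is the first group
 [startB]
 blabla
[end]
[end]
[startA]
this is the second group
 [startB]
 blabla
[end]
[end]
TEXT;

// Opening tag (anything in brackets except "end"), then either plain text
// or plain text + one recursive block + plain text, then the closing [end].
$pattern = '/\[(?!end)[^[\]]+\](?:[^[\]]*|[^[\]]*(?R)[^[\]]*)\[end\]\s*/';

$count = preg_match_all($pattern, $s, $m);

// $m[0] holds one full match per top-level group.
foreach ($m[0] as $i => $group) {
    echo "Group ", $i + 1, ":\n", trim($group), "\n\n";
}
```

Note that the alternation is not repeated, so as written each level can contain at most one nested block; input with several sibling blocks inside one group would need a different pattern.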

score 0 · Accepted Answer

Yes, you can indeed solve this with a lookahead:

$test_string = <<<TEST
[startA]
this is the first group
 [startB]
 blabla
[end]
[end]
[startA]
this is the second group
 [startB]
 blabla
[end]
[end]
[startA]
this is the third group
 [startB]
 blabla
[end]
[end]
TEST;
preg_match_all('#\[startA](.+?)\[end]\s*(?=\[startA]|$)#s', 
    $test_string, $matches);
var_dump($matches[1]);

Here is the ideone demo.

The key is to use alternation inside the lookahead subpattern to test for either the next [startA] section or the end of the string ($).
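
To make the role of the lookahead concrete, here is a small comparison sketch. The input string and the names `$text`, `$plain`, and `$guarded` are assumptions for illustration, not part of the original answer. Without the lookahead, the lazy `.+?` stops at the first [end] it meets; with it, a match can only end where the next [startA] or the end of the string follows.

```php
<?php
// Two groups; the first one contains a nested [startB] ... [end] pair.
$text = "[startA]\none\n [startB]\n x\n[end]\n[end]\n[startA]\ntwo\n[end]";

// No lookahead: the first capture is cut off at the inner [end].
preg_match_all('#\[startA](.+?)\[end]#s', $text, $plain);

// Lookahead: the closing [end] must be followed (after optional whitespace)
// by the next [startA] or by the end of the string.
preg_match_all('#\[startA](.+?)\[end]\s*(?=\[startA]|$)#s', $text, $guarded);

var_dump($plain[1]);   // first capture ends before the inner [end]
var_dump($guarded[1]); // first capture includes the inner [end]
```

The lookahead consumes nothing, so the next match can still start at the [startA] it has just tested for.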

Note the /s modifier: without it, the . metacharacter does not match line endings ("\n").
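
The effect of the modifier can be checked with a short sketch like the one below. The input and the variable names are assumed for illustration. The same pattern is run twice on a multi-line group, once without and once with /s.

```php
<?php
$text = "[startA]\nline one\nline two\n[end]";

// Without /s the dot cannot cross "\n", so the multi-line body is never spanned.
$withoutS = preg_match('#\[startA](.+?)\[end]#', $text, $m1);

// With /s (PCRE_DOTALL) the dot also matches newlines.
$withS = preg_match('#\[startA](.+?)\[end]#s', $text, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```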
