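I am trying to match two parts of a string with a regular expression in PHP. I think there is a problem with greediness. I would expect the first regex (see the comments) to give me the same first two captures as the second regex, while still matching both strings. What am I doing wrong?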

I am trying to get +123 (if cd: is present, as in the first string) and 456.

<?php

$data[] = 'longstring start waste cd:+123yz456z longstring';
$data[] = 'longstring start waste +yz456z longstring';
$regexs[] = '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/'; // first
$regexs[] = '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/';  // second

foreach ($regexs as $regex) {
  foreach ($data as $string) {
    if (preg_match($regex, $string, $match)) {
      // array_slice drops $match[0] (the full match) and keeps only the capture groups
      echo "Tried '$regex' on '$string' and got " . implode(',', array_slice($match, 1));
      echo "\n";
    }
  }
}
?>

The output is:

Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456

There is no fourth line because cd: is not present in the second string.

Expected output (bearing in mind that I am not an expert); only the first line differs from the actual output:

Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456

1 Answer

OK, so you want to capture +123 if cd: is present, and always 456? Here is how I would do it:

$data[] = 'longstring start waste cd:+123yz456z longstring';
$data[] = 'longstring start waste +yz456z longstring';

$regexs[] = '/start.+?(?:cd:(.+?)y)?.*?z(.+?)z/';

By liberally applying the non-greedy (?) modifier to the quantifiers, you can get it to do exactly what you want.
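
To illustrate what the non-greedy modifier changes, here is a minimal sketch. The sample string and the variable names are made up for this illustration and do not come from the question.

```php
<?php
// Hypothetical sample string containing two z...z delimited fields.
$subject = 'start z456z and z789z end';

// Greedy: .+ grabs as much as possible, then backtracks to the LAST z.
preg_match('/z(.+)z/', $subject, $greedy);

// Non-greedy: .+? grabs as little as possible and stops at the FIRST closing z.
preg_match('/z(.+?)z/', $subject, $lazy);

echo $greedy[1], "\n"; // 456z and z789
echo $lazy[1], "\n";   // 456
```

Keep in mind that a non-greedy quantifier only changes which match the engine prefers. When an optional group follows it, the engine can still satisfy the pattern by skipping that group entirely.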

Also note the (?:) non-capturing groups. They are very useful.
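
As a small sketch of why non-capturing groups help here: the wrapper around cd:...y no longer takes up a slot in the match array, so +123 moves from index 2 to index 1. The subject below is just a fragment of the first sample string, and the variable names are assumptions for this example.

```php
<?php
// Fragment taken from the first sample string in the question.
$subject = 'cd:+123yz';

// Capturing wrapper: the outer group takes index 1, pushing +123 to index 2.
preg_match('/(cd:([^y]+)y)z/', $subject, $capturing);

// Non-capturing wrapper: only the inner group is numbered, so +123 is index 1.
preg_match('/(?:cd:([^y]+)y)z/', $subject, $nonCapturing);

echo implode(',', array_slice($capturing, 1)), "\n";    // cd:+123y,+123
echo implode(',', array_slice($nonCapturing, 1)), "\n"; // +123
```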

Edit: Apparently this doesn't work, so let's try a different approach, using an either/or group (alternation):

$regexs[] = '/start.+?(?:cd:(.+?)yz(.+?)z|\+yz(.+?)z)/';
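
Here is a rough sketch of how the alternation pattern could be used on the two sample strings from the question. The helper name `extract_parts` and its return keys are hypothetical, introduced only for this illustration, and the code assumes PHP 7.1 or later. One thing to watch for: 456 lands in group 2 when the cd: branch matches and in group 3 otherwise, so the sketch normalizes both cases.

```php
<?php
$data = [
    'longstring start waste cd:+123yz456z longstring',
    'longstring start waste +yz456z longstring',
];

$regex = '/start.+?(?:cd:(.+?)yz(.+?)z|\+yz(.+?)z)/';

// Hypothetical helper: returns ['cd' => ..., 'value' => ...], or null if nothing matched.
function extract_parts(string $regex, string $subject): ?array
{
    if (!preg_match($regex, $subject, $m)) {
        return null;
    }
    // Groups that did not take part in the match are '' or missing, so default them.
    $cd = $m[1] ?? '';
    $value = ($m[2] ?? '') !== '' ? $m[2] : ($m[3] ?? '');
    return ['cd' => $cd === '' ? null : $cd, 'value' => $value];
}

foreach ($data as $string) {
    $parts = extract_parts($regex, $string);
    if ($parts === null) {
        continue; // no match for this string
    }
    echo ($parts['cd'] ?? '(none)'), ',', $parts['value'], "\n";
}
```

With the two sample strings this should print `+123,456` and `(none),456`.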
Answered 2011-10-24T19:52:39.627