php - REGEXP returns false on special characters

Question

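I'm not very good with regular expressions and hope someone can explain this to me. I found the following in code I'm debugging, and I'd like to know why it always fails to match in this case.

I know that \p{L} matches a single code point in the "letter" category, and that 0-9 are digits.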

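$regExp = '/^\s*
     (?P<monthDay>([0-2]?[1-9]|[12]0|3[01]))\s+
     (?P<monthNameFull>\p{L}+?)\s+
     (?P<yearFull>[12]\d{3})\s*$/ix';
// x is needed so that the line breaks and indentation inside the pattern are ignored

$value = '12 Février 2015';
$matches = array();

$match = preg_match($regExp, $value, $matches);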

Additional information: I reduced it to this:

$match = preg_match("/^\s*(?P<monthDay>([0-2]?[1-9]|[12]0|3[01]))\s+(?P<monthNameFull>\p{L}+?)\s+(?P<yearFull>[12]\d{3})\s*$/i", "18 Février 2015");
var_dump($match); //It will print int(0).

But if the value is 18 February 2015, it prints int(1). Why? I expected 1 for both values, since \p{L} should accept Unicode characters.

score 1 · Accepted Answer

$regExp = '/^\s*(?P<y>([0-2]?[1-9]|[12]0|3[01]))\s+(?P<m>\p{L}+?)\s+(?P<d>[12]\d{3})\s*$/usD';

$value = '12 Février 2015';
$matches = array();

$match = preg_match($regExp, $value, $matches);

var_dump($matches);

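Unless you want an error, you always have to use (?P<name>...), and for Unicode multi-line strings you need the usD flags. That is easy to remember: it is just like the US dollar (USD). Here u treats the pattern and the subject as UTF-8, s lets . match newlines, and D makes $ match only at the very end of the subject.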
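The effect of the D modifier can be sketched as follows. This is only an illustration, not part of the answer above: the pattern is a shortened, hypothetical variant without the trailing \s*, because with \s*$ a final newline is consumed anyway and D makes no visible difference.

```php
<?php
// Hypothetical input: a date followed by a stray newline.
// "\xC3\xA9" is the UTF-8 byte sequence for é.
$value = "12 F\xC3\xA9vrier 2015\n";

// Shortened pattern (an assumption for this sketch): month name and year only.
$body = '(?P<month>\p{L}+)\s+(?P<year>[12]\d{3})$';

// Without D, "$" may also match just before a trailing newline.
$withoutD = preg_match('/' . $body . '/u', $value);

// With D, "$" matches only at the very end of the subject.
$withD = preg_match('/' . $body . '/uD', $value);

var_dump($withoutD); // int(1)
var_dump($withD);    // int(0)
```

Whether the stricter behaviour is wanted depends on whether input with a trailing line break should count as valid.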

score 0 · Accepted Answer

Use the u (Unicode) modifier:

$regExp = '/^\s*
   (?P<monthDay>([0-2]?[1-9]|[12]0|3[01]))\s+
   (?P<monthNameFull>\p{L}+?)\s+
   (?P<yearFull>[12]\d{3})\s*$/ux';
// u: treat pattern and subject as UTF-8; x: ignore the line breaks inside the pattern

The i modifier is not required, since \p{L} matches both uppercase and lowercase letters.
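A minimal sketch of the difference the u modifier makes, using a shortened pattern that is an assumption of this example rather than the exact one from the question. Without u, PCRE walks the subject byte by byte, and é occupies two bytes in UTF-8, so it is not seen as a single letter.

```php
<?php
// "\xC3\xA9" is the UTF-8 byte sequence for é, written out so the example
// does not depend on the encoding of this source file.
$value = "18 F\xC3\xA9vrier 2015";

// Shortened, hypothetical pattern: day, month name, four-digit year.
$body = '^\s*\d{1,2}\s+\p{L}+\s+[12]\d{3}\s*$';

$bytes    = strlen("\xC3\xA9");                      // é is two bytes long
$withoutU = preg_match('/' . $body . '/i', $value);  // bytes, not code points
$withU    = preg_match('/' . $body . '/u', $value);  // subject decoded as UTF-8

var_dump($bytes);    // int(2)
var_dump($withoutU); // int(0), as reported in the question
var_dump($withU);    // int(1)
```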

score 0 · Accepted Answer

I found a fix: use /u instead of /i.

$match = preg_match("/^\s*(?P<monthDay>([0-2]?[1-9]|[12]0|3[01]))\s+(?P<monthNameFull>\p{L}+?)\s+(?P<yearFull>[12]\d{3})\s*$/u", "18 Février 2015");
var_dump($match); //It will print int(1).

Thanks, everyone, for the help.

score 0 · Accepted Answer

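Named groups are not needed, and their syntax looks wrong in the question. This cleaned-up version should work, provided the u modifier is added so that \p{L} matches accented letters (literal spaces inside the pattern must also go, unless the x modifier is used):

/^\s*([0-2]?[1-9]|[12]0|3[01])\s+\p{L}+?\s+[12]\d{3}\s*$/iu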

The day-of-month pattern is also easier to read as:

(0?[1-9]|[12][0-9]|3[01])
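As a quick sanity check (a sketch, with anchors and a non-capturing group added for the test), the two day-of-month patterns can be compared over every one- and two-digit candidate to see whether they accept the same strings:

```php
<?php
// Both patterns anchored so that the whole candidate must be a valid day.
$original   = '/^(?:[0-2]?[1-9]|[12]0|3[01])$/D';
$simplified = '/^(?:0?[1-9]|[12][0-9]|3[01])$/D';

$disagreements = [];
for ($n = 0; $n <= 99; $n++) {
    // Try both the plain and the zero-padded spelling, e.g. "7" and "07".
    foreach ([(string) $n, sprintf('%02d', $n)] as $candidate) {
        if (preg_match($original, $candidate) !== preg_match($simplified, $candidate)) {
            $disagreements[] = $candidate;
        }
    }
}

var_dump($disagreements); // expected to be an empty array
```

If the list comes back empty, the simplified pattern is a drop-in replacement for these inputs. Neither pattern checks that the day actually exists in the given month.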
