Why does the following regex, $regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';, not match all of the inputs below?

I thought that (V|E)?\d{1,2}? ? would make the letter, the first one or two digits, and the first space optional.

Input

<?php

$sms = array(
    'test test test 11 111 111 test test test',
    'test test test 1 111 111 test test test',
    'test test test 111 111 test test test', // does not match
    'test test test test test test 11111111',
    'test test test 1111111 test test test',
    'test test test 111111 test test test', // does not match
    'test test test E11 111 111 test test test',
    'test test test V1 111 111 test test test',
    'test test test V111 111 test test test', // does not match
    'test test test V11111111 test test test',
    'test test test V1111111 test test test',
    'test test test E111111 test test test', // does not match
    'test test test V 11 111 111 test test test',
    'test test test V 1 111 111 test test test',
    'test test test E 111 111 test test test', // does not match
    'test test test V 11111111 test test test',
    'test test test V 1111111 test test test',
    'test test test V 111111 test test test', //does not match
    'test test test V11 111 111 test test test',
    'test test test V1 111 111 test test test',
    'test test test E111 111 test test test', //does not match
    'test test test V11111111 test test test',
    'V1111111 test test test  test test test',
    'test test test V111111 test test test', // does not match
);

$regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';
$noMatches = 0;
$index = 0;
foreach($sms as $v) {
    $match = preg_match($regex, $v, $matches);



    if($match) {
        //print_r($matches);
        //echo "$v match!\n";
        //$matches++;
    }
    else {
        echo "$index - $v does NOT match!\n";
        $noMatches++;
    }
    $index++;
}
$total = count($sms);
echo "\n\nTotal: $total\nNo Matches: $noMatches\n";

Output

$ php test-regex.php 
2 - test test test 111 111 test test test does NOT match!
5 - test test test 111111 test test test does NOT match!
8 - test test test V111 111 test test test does NOT match!
11 - test test test E111111 test test test does NOT match!
14 - test test test E 111 111 test test test does NOT match!
17 - test test test V 111111 test test test does NOT match!
20 - test test test E111 111 test test test does NOT match!
23 - test test test V111111 test test test does NOT match!


Total: 24
No Matches: 8

Edit:

Following Mario's suggestion, the regex is now $regex = '/\b(V|E)?\d{0,2} ?\d{3} ?\d{3}\b/i';. Why does this regex fail to capture the letter V or E in some cases?

$output = array(
    'test test test E11 111 111 test test test' => 'E11 111 111',
    'test test test V1 111 111 test test test' => 'V1 111 111',
    'test test test V111 111 test test test' => 'V111 111',
    'test test test V11111111 test test test' => 'V11111111',
    'test test test V1111111 test test test' => 'V1111111',
    'test test test E111111 test test test' => 'E111111',
    'test test test V 11 111 111 test test test' => '11 111 111', // Missing Letter
    'test test test V 1 111 111 test test test' => '1 111 111', // Missing Letter
    'test test test E 111 111 test test test' => 'E 111 111',
    'test test test V 11111111 test test test' => '11111111', // Missing Letter
    'test test test V 1111111 test test test' => '1111111', // Missing Letter
    'test test test V 111111 test test test' => 'V 111111',
    'test test test V11 111 111 test test test' => 'V11 111 111',
    'test test test V1 111 111 test test test' => 'V1 111 111',
    'test test test E111 111 test test test' => 'E111 111',
    'test test test V11111111 test test test' => 'V11111111',
    'V1111111 test test test  test test test' => 'V1111111',
    'test test test V111111 test test test' => 'V111111',
    'V 1111111 test test test' => '1111111', // Missing Letter
    'test test test V 1111111 test test test' => '1111111', // Missing Letter
);

3 Answers


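? is the "optional" quantifier only when it follows a group, a literal character, or a character class.

If ? comes after another quantifier such as *, + or {n,m}, it only makes the match less greedy. This means the regex will try to match the smallest possible amount.

So \d{1,2}? does not mean optional. It means match one or two digits, but prefer to match only one. What you meant to write is \d{0,2}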
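
The difference can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and the sample input is one of the failing lines from the question.

```php
<?php
// Lazy quantifier: still requires at least one digit, but stops at one if it can.
preg_match('/\d{1,2}?/', '12', $lazy);
// Greedy quantifier for comparison: takes both digits when they are available.
preg_match('/\d{1,2}/', '12', $greedy);

echo $lazy[0], "\n";   // 1
echo $greedy[0], "\n"; // 12

// The pattern from the question versus the same pattern with \d{0,2}.
$original = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';
$fixed    = '/\b(V|E)?\d{0,2} ?\d{3} ?\d{3}\b/i';
$input    = 'test test test 111 111 test test test';

$originalHits = preg_match($original, $input);   // 0: a seventh digit is required
$fixedHits    = preg_match($fixed, $input, $m);  // 1: the leading digits may be absent

echo "original: $originalHits, fixed: $fixedHits\n";
```

Because the space in the pattern is optional and sits before the first \d{3}, the text matched by the fixed pattern can begin with a space, so trimming $m[0] before using it is probably a good idea.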

answered 2013-01-16T23:19:41.903

They don't match because the regex requires at least 7 digits in total:

/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/
             |        |      |
             |        |      \-------->  3 digits exactly
             |        \--------------->  3 digits exactly
             \------------------------>  1 or 2 digits (prefers 1, but will match
                                         2 if there are 8 digits in a row)

All of the failing inputs are one digit short.
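
One way to see this is to count the digits in a few of the inputs and compare that with the match result. A minimal sketch, where countDigits is a hypothetical helper and the four samples are taken from the question:

```php
<?php
// Hypothetical helper: number of digit characters in a string.
function countDigits(string $s): int {
    return (int) preg_match_all('/\d/', $s);
}

// Pattern from the question: 1 or 2 digits, then 3, then 3 (7 or 8 in total).
$regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';

$samples = array(
    'test test test 1 111 111 test test test',  // 7 digits
    'test test test 111 111 test test test',    // 6 digits
    'test test test V111111 test test test',    // 6 digits
    'test test test V11111111 test test test',  // 8 digits
);

$results = array();
foreach ($samples as $s) {
    $results[] = array(countDigits($s), preg_match($regex, $s));
    printf("%d digits, match=%d: %s\n", countDigits($s), preg_match($regex, $s), $s);
}
```

The inputs with 7 or 8 digits match and the ones with 6 do not.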

answered 2013-01-16T23:19:13.700

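If you want to make the first part entirely optional, you have to wrap it in parentheses and append a ?. You can also use a character class instead of V|E: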

(?:[VE]\d{1,2} )?
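
A rough sketch of how that fragment might be used. Only the optional group comes from this answer; the rest of the pattern (the word boundaries and the two \d{3} parts) is an assumption borrowed from the question, and the variable names are illustrative.

```php
<?php
// Assumed full pattern: the optional prefix group followed by two 3-digit blocks.
$regex = '/\b(?:[VE]\d{1,2} )?\d{3} ?\d{3}\b/i';

// With the prefix present, the whole group is consumed.
preg_match($regex, 'test E11 111 111 test', $withPrefix);
// With no prefix, the group is skipped as a unit.
preg_match($regex, 'test 111 111 test', $withoutPrefix);
// The group ends in a mandatory space, so a letter glued to the digits does not match.
$gluedHits = preg_match($regex, 'test V111 111 test');

echo $withPrefix[0], "\n";    // E11 111 111
echo $withoutPrefix[0], "\n"; // 111 111
echo $gluedHits, "\n";        // 0
```

Written this way, the prefix is all-or-nothing: the letter, its one or two digits, and the trailing space are either all present or all absent, so inputs like V111 111 would need a looser group.
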
answered 2013-01-16T23:19:46.467