php - How to check against two possible regular expressions?

Question

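Suppose I want to parse someone's home address into street, house number, city, and so on.

In my case, the data can be formatted in two (very different) ways, so I have two very long regexes to check. If one of them matches, I want to extract the data from it.

1: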

Long Square
25
London
...

2:

London
Living: Long Square, 25
....

How should I check against both? Should I just use two if clauses and test them one after the other, for example:

if (preg_match(@$match_regex, file_get_contents($tag->getAttribute("src")), $matches) == true)
{
  //regex 1 matched
}
else if (preg_match(@$match_regex_2, file_get_contents($tag->getAttribute("src")), $matches))
{
  //regex 2 matched
}
else
{
  //no match
}

Or should I somehow check both in a single regex?

Something like:

[regex_1|regex_2]

Which approach is preferred, and which is "faster" in terms of CPU?

score 2 · Accepted Answer

The fastest approach is to search for the text Living: first and then run only the matching regex:

$string = file_get_contents($tag->getAttribute("src"));
$matched = false;
$matches = array();

if (false === strpos($string, 'Living:')) {
    $matched = preg_match(@$match_regex, $string, $matches);
} else {
    $matched = preg_match(@$match_regex_2, $string, $matches);
}

if (!$matched) {
    // no match
} else {
    // print matches
}

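Note that I keep the two pieces of logic separate. The first if block determines the type of the address string and runs the appropriate regex. The second if block checks whether a match occurred (regardless of which regex was run).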
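To make the dispatch concrete, here is a minimal sketch with two short patterns standing in for the long ones. The patterns, the group names (street, number, city) and the parseAddress() helper are assumptions made up for illustration. They are not from the question and would need adapting to the real data.

```php
<?php
// Hypothetical patterns for the two layouts shown in the question.
// Format 1: street, house number and city on separate lines.
$match_regex = '/^(?<street>[^\r\n]+)\R(?<number>\d+)\R(?<city>[^\r\n]+)/';
// Format 2: city first, then "Living: street, number".
$match_regex_2 = '/^(?<city>[^\r\n]+)\RLiving:\s*(?<street>[^,\r\n]+),\s*(?<number>\d+)/';

// Hypothetical helper: returns the parsed fields, or null when nothing matches.
function parseAddress(string $string, string $regex1, string $regex2): ?array
{
    // A cheap substring test selects the pattern, so only one regex ever runs.
    $pattern = (strpos($string, 'Living:') === false) ? $regex1 : $regex2;

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $string, $m) !== 1) {
        return null;
    }

    return ['street' => $m['street'], 'number' => $m['number'], 'city' => $m['city']];
}

$a = parseAddress("Long Square\n25\nLondon", $match_regex, $match_regex_2);
$b = parseAddress("London\nLiving: Long Square, 25", $match_regex, $match_regex_2);

// Both layouts should produce the same fields.
var_dump($a === $b);
```

Named groups let both layouts yield the same keys, so the code that consumes the result does not need to know which regex matched.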

score 1 · Accepted Answer

Don't make assumptions about performance. Measure it.

A single combined regex would be

(regex1)|(regex2)

Once you have both versions, run them against your data and measure the time.
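One way such a measurement could look is sketched below. Square brackets define a character class in PCRE, so the alternation has to be written with parentheses as above. The pattern bodies, the sample inputs and the timeIt() helper are hypothetical placeholders, and hrtime() assumes PHP 7.3 or later. Real timings depend entirely on your own regexes and data.

```php
<?php
// Hypothetical stand-ins for the two long patterns (without delimiters).
$body1 = '^([^\r\n]+)\R(\d+)\R([^\r\n]+)';
$body2 = '^([^\r\n]+)\RLiving:\s*([^,\r\n]+),\s*(\d+)';

$regex1   = '/' . $body1 . '/';
$regex2   = '/' . $body2 . '/';
$combined = '/(' . $body1 . ')|(' . $body2 . ')/';

// Variant A: try the two regexes one after the other.
$sequential = function (string $s) use ($regex1, $regex2): bool {
    return preg_match($regex1, $s) === 1 || preg_match($regex2, $s) === 1;
};

// Variant B: one alternation covering both layouts.
$single = function (string $s) use ($combined): bool {
    return preg_match($combined, $s) === 1;
};

// Hypothetical helper: runs $fn over all inputs $rounds times, returns milliseconds.
function timeIt(callable $fn, array $inputs, int $rounds = 1000): float
{
    $start = hrtime(true); // monotonic clock, nanoseconds
    for ($i = 0; $i < $rounds; $i++) {
        foreach ($inputs as $input) {
            $fn($input);
        }
    }
    return (hrtime(true) - $start) / 1e6;
}

// Replace these samples with a representative set of real addresses.
$inputs = ["Long Square\n25\nLondon", "London\nLiving: Long Square, 25"];

printf("sequential: %.3f ms\n", timeIt($sequential, $inputs));
printf("combined:   %.3f ms\n", timeIt($single, $inputs));
```

Running each variant several times and comparing typical values gives a more trustworthy picture than a single run.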
