php - Combining multiple regular expressions (match and replace) into a single regex; optimizing for speed

Question

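I have a PHP preg_match implementation in which I compare a known regex against a cleaned-up version of another variable. I do the cleanup with several preg_replace calls and similar commands. I would like to know whether there is another way to implement the same logic that is smaller (ideally a single regex match) and faster (matching several times costs more than matching once).

Here is my current code: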

$url_regex_to_match = '/SOME_REGEX/'; // I will pick this from the DB

$matches = array();

// The following turns http://www.google.com into http://google.com
preg_match('/(http.?):\/\/(www\.)?(.*)/i', $url, $matches);
if(sizeof($matches)==4) {
    $url = $matches[1]."://".$matches[3]; 
}
// In case the preg_match fails (http is missing), we still need to remove www.
$url = preg_replace("/(^\*?|\/\/)www\./i","$1",$url);

//It converts google.com/a#mno into google.com/a
$url = preg_replace('/^(.*)(#.*)$/', '$1', $url);
//It converts pages like google.com/index.htm into google.com/
$url = preg_replace('/^(.*\/)((home|default|index)\..{3,4})(\?.*)*$/', '$1$4', $url);
//This will replace google.com/ into google.com
if(substr($url, -1) == "/") {
    $url = substr($url, 0, -1);
}

//This is just to match the new URLs with the pattern I have
$boolean = preg_match($url_regex_to_match , $url);

The expected value of $boolean is, of course, true/false.

Thanks.

score 0 · Accepted Answer

I wonder what exactly you want. I mean, extracting the domain can be done with a single new regex, like this:

preg_replace('/http[s]*:\/\/[\w\d\.-]*\.([\d\w-]*)\..+\/(.*)/i', '$1', $url);

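So basically my answer is: build one regex for your problem instead of several. I don't see any other way, because the alternative would essentially require the computer to understand what each regex searches for and merge them (which would most likely result in a slower regex). If my solution doesn't help you, please let me know in the comments.

Edit: Sorry, I have clarified my regex.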
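To make the "one regex instead of several" idea concrete, here is a rough sketch of how the cleanup steps from the question might be folded into a single preg_replace() call. The function name normalize_url and the exact sub-patterns are my own assumptions, not code from the question. It is also not a drop-in equivalent: it ignores the optional leading "*" the original www pattern allowed, it cuts the fragment at the first "#" rather than the last, and it only accepts 3 or 4 extension characters that are not "/", "?" or "#". Whether it is actually faster would need to be measured.

```php
<?php
// Hypothetical helper (not from the original post): every alternative in the
// pattern matches a piece of the URL that should be dropped, so a single
// preg_replace() call with an empty replacement performs all the cleanup.
function normalize_url(string $url): string
{
    // Default documents such as index.htm, home.php, default.aspx
    $doc = '(?:home|default|index)\.[^/?#]{3,4}';

    $pattern = '~' . implode('|', [
        // "www." at the very start or right after "//" (case-insensitive)
        '(?:^|(?<=//))(?i:www\.)',
        // trailing "/", plus an optional default document and optional fragment
        '/(?:' . $doc . ')?(?:#.*)?$',
        // default document that is followed by a query string
        '(?<=/)' . $doc . '(?=\?)',
        // fragment when there is no trailing slash or default document before it
        '#.*$',
    ]) . '~';

    return preg_replace($pattern, '', $url);
}

echo normalize_url('http://www.google.com/index.htm#mno'), "\n"; // http://google.com
echo normalize_url('google.com/a#mno'), "\n";                    // google.com/a
echo normalize_url('google.com/index.htm?x=1'), "\n";            // google.com/?x=1
```

The final check from the question would then become preg_match($url_regex_to_match, normalize_url($url)). The alternatives are written so that none depends on another having already been applied, which is what lets them share one pass over the string.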
