I was looking at some code and started thinking about the use of preg_replace.

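First off: I realize that using preg_replace for this task is probably overkill, that it is likely needlessly expensive, and that it would be better handled with PHP's string functions, such as substr. I know that.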
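
For reference, a minimal sketch of that string-function approach. The helper name parentUri is hypothetical (it is not from the code I was reviewing), and returning the input unchanged when there is no slash is an assumption chosen to mirror what a non-matching preg_replace does.

```php
<?php
// Hypothetical helper: strips the last path segment using plain string functions.
function parentUri(string $uri): string
{
    $pos = strrpos($uri, '/');      // offset of the last slash, or false if none
    if ($pos === false) {
        return $uri;                // assumption: leave slash-less input untouched
    }
    return substr($uri, 0, $pos);   // everything before the last slash
}

echo parentUri('/one/cool/uri'), PHP_EOL; // /one/cool
```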

That said, consider these two different regular expressions:

$uri = '/one/cool/uri';    // Desired result '/one/cool'

// Using a back-reference
$parent = preg_replace('#(.*)/.*#', "$1", $uri);

// Using character class negation
$parent = preg_replace('#/[^/]+$#', '', $uri);

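By default, I would assume that creating the back-reference in the first case is more expensive than not creating one, so the second example would be preferable. But then I started wondering whether using [^/] in the second example might be more expensive than the corresponding . in the first example and, if so, by how much.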

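From a readability standpoint I prefer the first example, and since we are splitting hairs, that is the one I am inclined to pick of the two (after all, there is value in writing readable code too). It may just be my personal preference.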

Thoughts?

1 Answer

I would also measure the running time of both options. This passage from the docs may help too:

http://www.php.net/manual/en/regexp.reference.performance.php

If you are using such a pattern with subject strings that do not contain newlines, the best performance is obtained by setting PCRE_DOTALL, or starting the pattern with ^.* to indicate explicit anchoring. That saves PCRE from having to scan along the subject looking for a newline to restart at.

So, $parent = preg_replace('#^(.*)/.*#s', "$1", $uri); may speed up the first option. The second one would not need this setup:

s (PCRE_DOTALL)

If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
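
To actually measure it, something along these lines could be used. It is only a rough sketch: the iteration count is arbitrary, microtime(true) gives wall-clock time rather than CPU time, and the labels and variable names are made up for illustration. It assumes PHP 7.1 or later for the array destructuring in foreach. Timings will vary by machine and PCRE version, so none are shown here.

```php
<?php
$uri = '/one/cool/uri';

// The two patterns from the question plus the anchored DOTALL variant.
// Labels are illustrative only.
$candidates = [
    'back-reference'          => ['#(.*)/.*#',   '$1'],
    'anchored back-reference' => ['#^(.*)/.*#s', '$1'],
    'negated class'           => ['#/[^/]+$#',   ''],
];

$iterations = 100000; // arbitrary; raise it if the timings are too noisy

$results = [];
foreach ($candidates as $label => [$pattern, $replacement]) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $results[$label] = preg_replace($pattern, $replacement, $uri);
    }
    $elapsed = microtime(true) - $start;

    // Print the result next to the timing to confirm all variants agree.
    printf("%-24s %-12s %.4f s\n", $label, $results[$label], $elapsed);
}
```

Running it several times and comparing the spread is more informative than a single run, since the differences on such a short subject string are likely to be small.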

answered 2012-11-30T23:28:09.727