My regular expression is as follows:

(?<![\s]*?(\"|&quot;)")WORD(?![\s]*?(\"|&quot;))

As you can see, I am trying to match all instances of WORD unless they are inside "quotation marks". So...

WORD <- Find this
"WORD" <- Don't find this
"   WORD   " <- Also don't find this, even though not touching against marks
&quot;WORD&quot;  <- Don't find this (I check for both &quot; and " so it still works after htmlspecialchars)

I believe my regex would work perfectly if I didn't get this error:

Compilation failed: lookbehind assertion is not fixed length

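Given the limitations of lookbehind, is there any way to do what I want?

If you can think of any other approach, please let me know.

Many thanks,

Matthew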

P.S. The WORD part will actually contain John Gruber's URL detector.

2 Answers

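I would suggest a different approach. It works as long as the quotes are correctly balanced: if the number of quote marks that follow WORD is odd, you know you are inside a quoted string, which makes the lookbehind unnecessary: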

if (preg_match(
'/WORD             # Match WORD
(?!                # unless it\'s possible to match the following here:
 (?:               # a string of characters
  (?!&quot;)       # that contains neither &quot;
  [^"]             # nor "
 )*                # (any length),
 ("|&quot;)        # followed by either " or &quot; (remember which in \1)
 (?:               # Then match
  (?:(?!\1).)*\1   # any string except our quote char(s), followed by that quote char(s)
  (?:(?!\1).)*\1   # twice,
 )*                # repeated any number of times --> even number
 (?:(?!\1).)*      # followed only by strings that don\'t contain our quote char(s)
 $                 # until the end of the string
)                  # End of lookahead/sx', 
$subject)) {
    // WORD occurs at least once outside a quoted string
}
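To see how the lookahead behaves, here is a minimal sketch that runs a compact, comment-free copy of the pattern above against the four sample inputs from the question. The variable names `$outsideQuotes` and `$samples` are made up for illustration, and the sketch assumes balanced quotes, as the answer does.

```php
<?php
// Compact form of the commented pattern above (same logic, no /x comments).
// $outsideQuotes is a hypothetical name used only in this sketch.
$outsideQuotes = '/WORD(?!(?:(?!&quot;)[^"])*("|&quot;)(?:(?:(?!\1).)*\1(?:(?!\1).)*\1)*(?:(?!\1).)*$)/s';

// The four cases listed in the question.
$samples = [
    'WORD',
    '"WORD"',
    '"   WORD   "',
    '&quot;WORD&quot;',
];

foreach ($samples as $s) {
    // preg_match returns 1 when WORD is followed by an even number of quote marks.
    printf("%-20s %s\n", $s, preg_match($outsideQuotes, $s) ? 'match' : 'no match');
}
```

Only the first sample should report a match. Because the pattern counts the quote marks between WORD and the end of the string, a single stray quote later in the input flips the result for every WORD before it.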
answered 2013-06-25T15:01:06.057

I would suggest removing the quoted strings and then searching what is left.

$noSubs = preg_replace('/(["\']|&quot;)(\\\\\1|(?!\1).)*\1/', '', $target);
$n = preg_match_all('/\bWORD\b/', $noSubs, $matches);
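As a quick check of the two lines above, the following sketch runs them on a made-up input. The sample text in `$target` is an assumption chosen to cover both `"` and `&quot;` delimiters.

```php
<?php
// Hypothetical sample input: two quoted occurrences, two unquoted ones.
$target = 'WORD "WORD" &quot; WORD &quot; and WORD again';

// Step 1: delete every quoted string (delimited by ", ' or &quot;).
$noSubs = preg_replace('/(["\']|&quot;)(\\\\\1|(?!\1).)*\1/', '', $target);

// Step 2: count WORD in what remains.
$n = preg_match_all('/\bWORD\b/', $noSubs, $matches);

echo $noSubs, "\n"; // the text with both quoted strings removed
echo $n, "\n";      // number of unquoted occurrences
```

Since this pattern also treats `'` as a delimiter, text containing apostrophes (for example "don't ... it's") would be stripped between them, so it may be safer to drop `'` from the character class for ordinary prose.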

The regex I used above to strip quoted strings treats &quot;, " and ' as separate string delimiters. For any single delimiter, your regex would look more like this:

/"(\\"|[^"])*"/

So, if you want to treat &quot; as equivalent to ":

/("|&quot;)(\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)/i

If you also want to handle single-quoted strings (assuming there are no words with apostrophes):

/("|&quot;)(\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)|'(\\'|[^'])*'/i

Be careful to escape these properly when putting them in a PHP string.
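To make the escaping concrete, here is a sketch of the last pattern written two ways: as a single-quoted PHP string, where each regex backslash pair and each apostrophe needs escaping, and as a nowdoc, where the regex can be pasted unchanged. The variable names and the sample input are assumptions for illustration.

```php
<?php
// Single-quoted string: every "\\" in the regex becomes "\\\\", every "'" becomes "\'".
$fromSingle = '/("|&quot;)(\\\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)|\'(\\\\\'|[^\'])*\'/i';

// Nowdoc: no escape processing at all, so the regex appears exactly as written above.
$fromNowdoc = <<<'REGEX'
/("|&quot;)(\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)|'(\\'|[^'])*'/i
REGEX;

// Both spellings should produce the same pattern string.
var_dump($fromSingle === $fromNowdoc);

// Hypothetical input containing an escaped quote inside a quoted string.
$input = 'say "a \" b" and \'c\' done';
$stripped = preg_replace($fromNowdoc, '', $input);
echo $stripped, "\n";
```

The nowdoc form is easier to compare against a regex tested elsewhere, at the cost of a few extra lines.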

Edit

Qtax mentioned that you may be trying to replace the matched WORD data. In that case, you can easily tokenize the string with a regex like this:

/("|&quot;)(\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)|((?!"|&quot;).)+/i

into quoted strings and unquoted segments, then build a new string with your replacement, operating only on the unquoted parts:

// The "s" modifier lets "." match newlines, so line breaks in unquoted text are not dropped.
$tokenizer = '/("|&quot;)(\\\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)|((?!"|&quot;).)+/is';
$hasQuote = '/"|&quot;/i';
$word = '/\bWORD\b/';
$replacement = 'REPLACEMENT';
$n = preg_match_all($tokenizer, $target, $matches, PREG_SET_ORDER);
$newStr = '';
if ($n === false) {
    /* Print error Message */
    die();
}
foreach($matches as $match){
    if(preg_match($hasQuote, $match[0])){
        //If it has a quote, it's a quoted string.
        $newStr .= $match[0];
    } else {
        //Otherwise, run the replace.
        $newStr .= preg_replace($word, $replacement, $match[0]);
    }
}

//Now $newStr has your replaced String.  Return it from your function, or print it to
//your page.
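The same tokenize-and-replace idea can also be sketched with `preg_replace_callback`, which is a swapped-in variation and not the code from this answer. The function name `replaceOutsideQuotes` and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: replaces WORD only in unquoted segments of $target.
function replaceOutsideQuotes(string $target, string $replacement): string
{
    // Same tokenizer as above, with the "s" modifier so "." also matches newlines.
    $tokenizer = '/("|&quot;)(\\\\("|&quot;)|(?!&quot;)[^"])*("|&quot;)|((?!"|&quot;).)+/is';

    $result = preg_replace_callback(
        $tokenizer,
        function (array $m) use ($replacement) {
            // A token containing a quote mark is a quoted string: keep it verbatim.
            if (preg_match('/"|&quot;/i', $m[0])) {
                return $m[0];
            }
            // Otherwise it is an unquoted segment: run the replacement.
            return preg_replace('/\bWORD\b/', $replacement, $m[0]);
        },
        $target
    );

    if ($result === null) {
        throw new RuntimeException('preg_replace_callback failed');
    }
    return $result;
}

echo replaceOutsideQuotes('WORD "WORD" &quot; WORD &quot; WORD', 'X'), "\n";
// prints: X "WORD" &quot; WORD &quot; X
```

One difference from the loop above: `preg_replace_callback` leaves any text the tokenizer does not match (such as a lone unbalanced quote mark) in place, whereas rebuilding the string from `preg_match_all` results drops it.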
answered 2013-06-25T15:10:33.787