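I want to replace phrases with links.

The phrases are defined in a multidimensional array. There will be thousands of terms to replace, so the array needs to be unindexed, lightweight, and multidimensional.

Nothing should be replaced when the term is followed by a bracket or sits inside square brackets.

Problem: the regex itself works fine, but it breaks when a phrase contains regex syntax characters such as + ? / ( and so on. So I need to escape them. I tried every variation I could think of, but none of them works for all cases. I cannot escape them in $text or in $s.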

<?php

$text = "<html><body><pre>
Replace all foo / bar / baz cases here:
Case 1: Text Foo text.
Case 2: Text 'Foo' Bar text Foo.
Case 3: Text Foobar (2) text.
Case 4: Text Bar & Baz.
Case 5: Text Bar Baz?
Case 6: Text Bar? & Baz?
Case 7: Text Bar-X.

Replace nothing here (text followed by brackets) or [inside square brackets]: 
Case 1: Text Foo (text).
Case 2: Text 'Foo' Bar (text) Foo (text).
Case 3: Text Foobar (2) (text).
Case 4: Text Bar & Baz (text).
Case 5: Text Bar Baz (text).
Case 6: Text Bar? & Baz (text).
Case 7: Text Bar-X (text).
Case 8: [Text Foo]
</pre></body></html>";

$s = array(
  array("t" => "Foo",         "u" => "http://www.foo.net"),
  array("t" => "'Foo' Bar",   "u" => "http://www.foo.net"),
  array("t" => "Foobar (2)",  "u" => "http://www.foo.net"),
  array("t" => "Bar & Baz",   "u" => "http://www.foo.net"),
  array("t" => "Bar Baz?",    "u" => "http://www.foo.net"),
  array("t" => "Bar? & Baz?", "u" => "http://www.foo.net"),
  array("t" => "Bar-X",       "u" => "http://www.foo.net")
 );

$replaced = $text;
foreach ($s as $i => $row) {
# $replaced = preg_replace('/(?='.preg_quote($row["t"]).'[^\]][^(]+$)\b'.preg_quote($row["t"]).'\b/mS',
# $replaced = preg_replace('/(?='.preg_quote($row["t"], '/').'[^\]][^(]+$)\b'.preg_quote($row["t"], '/').'\b/mS',
# $replaced = preg_replace('/(?=\Q'.$row["t"].'\E[^\]][^(]+$)\b\Q'.$row["t"].'\E\b/mS',
    $replaced = preg_replace('/(?='.$row["t"].'[^\]][^(])\b'.$row["t"].'\b/mS',
                           '<a href="'.$row["u"].'">'.$row["t"].'</a>',
                           $replaced);
 }
echo $replaced;

?>

2 Answers

This should work, at least for the test cases provided:

$replaced = preg_replace('/([.,\s!^]+)('.preg_quote($row["t"],'/').')([.,\s!$]+)(?!\()/mS',
                           '$1<a href="'.$row["u"].'">$2</a>$3',
                           $replaced);

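\b does not work as expected when the term itself begins or ends with a non-word character (for example in Foobar (2)), so you should explicitly supply a list of allowed surrounding characters. I quickly put [.,\s!^] before the term and [.,\s!$] after it; you may need to add more allowed characters (for example -_?) depending on your requirements.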
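One caveat about the classes above: inside a character class, ^ (when it is not the first character) and $ are literal characters, so those classes do not match at the very start or end of a line. As a possible alternative, here is a minimal sketch that uses lookarounds instead of consuming the surrounding characters. The helper name linkTerm() is hypothetical, and the sketch assumes square brackets in the text are balanced and not nested.

```php
<?php
// Hypothetical helper, not from the original posts: wraps one term in a link
// unless it is followed by "(" or sits inside [...].
function linkTerm(string $text, string $term, string $url): string
{
    // Escape regex metacharacters and the "/" delimiter in the term.
    $quoted = preg_quote($term, '/');

    $pattern = '/(?<!\w)' . $quoted   // not glued to a preceding word character
             . '(?!\w)'               // not glued to a following word character
             . '(?!\s*\()'            // not followed by an opening parenthesis
             . '(?![^\[\]]*\])/';     // not inside [...] (assumes balanced, unnested brackets)

    // A callback avoids any "$1"-style interpretation of the URL in the replacement.
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($url): string {
            return '<a href="' . $url . '">' . $m[0] . '</a>';
        },
        $text
    );
}

echo linkTerm('Text Foobar (2) text.', 'Foobar (2)', 'http://www.foo.net'), "\n";
echo linkTerm('Text Foobar (2) (text).', 'Foobar (2)', 'http://www.foo.net'), "\n";
echo linkTerm('Case 8: [Text Foo]', 'Foo', 'http://www.foo.net'), "\n";
```

Only the first call inserts a link; the other two return their input unchanged. When this is run in a loop over all terms, overlapping terms (Foo inside 'Foo' Bar) can still be linked twice, so processing longer terms first or protecting already inserted links would likely be needed.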

answered 2010-06-19T17:02:19.693

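I'm not entirely sure what you are trying to do, but I see "breaks when a phrase contains regex syntax characters", which makes me think all you need to do is escape those characters, i.e. put a \ in front of each of them.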
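PHP's built-in preg_quote() does this escaping. A minimal sketch, assuming "/" is the pattern delimiter; the helper name termToPattern() is hypothetical and only there for illustration:

```php
<?php
// Hypothetical helper: builds a literal-match pattern from a raw term.
function termToPattern(string $term): string
{
    // The second argument makes preg_quote() escape the "/" delimiter as well.
    return '/' . preg_quote($term, '/') . '/';
}

$pattern = termToPattern('Bar? & Baz?');
echo $pattern, "\n";                                           // /Bar\? & Baz\?/
echo preg_match($pattern, 'Case 6: Text Bar? & Baz?'), "\n";   // 1
```

The question's commented-out attempts already call preg_quote(), so escaping alone is probably not the whole fix: \b placed next to a non-word character such as ? or ) does not behave like a phrase boundary.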

Edit:

I'm stuck on this too, but maybe showing you what I have so far will help:

<?php

$text = "<html><body><pre>
Replace all foo / bar / baz cases here:
Case 1: Text Foo text.
Case 2: Text 'Foo' Bar text Foo.
Case 3: Text Foobar (2) text.
Case 4: Text Bar & Baz.
Case 5: Text Bar Baz?
Case 6: Text Bar? & Baz?
Case 7: Text Bar-X.

Replace nothing here (text followed by brackets) or [inside square brackets]: 
Case 1: Text Foo (text).
Case 2: Text 'Foo' Bar (text) Foo (text).
Case 3: Text Foobar (2) (text).
Case 4: Text Bar & Baz (text).
Case 5: Text Bar Baz (text).
Case 6: Text Bar? & Baz (text).
Case 7: Text Bar-X (text).
Case 8: [Text Foo]
</pre></body></html>";

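// Replace a few regex metacharacters with their HTML entities so they are no
// longer special inside the pattern. Parentheses and "/" are not covered.
function convertRegexChars($string)
{
    $converted = str_replace("?","&#63;",$string);
    $converted = str_replace(".","&#46;",$converted);
    $converted = str_replace("*","&#42;",$converted);
    $converted = str_replace("+","&#43;",$converted);
    return $converted;
}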

$s = array(
  array("t" => "Foo",         "u" => "http://www.foo.net"),
  array("t" => "'Foo' Bar",   "u" => "http://www.foo.net"),
  array("t" => "Foobar (2)",  "u" => "http://www.foo.net"),
  array("t" => "Bar & Baz",   "u" => "http://www.foo.net"),
  array("t" => "Bar Baz?",    "u" => "http://www.foo.net"),
  array("t" => "Bar? & Baz?", "u" => "http://www.foo.net"),
  array("t" => "Bar-X",       "u" => "http://www.foo.net")
 );

$replaced = convertRegexChars($text);
foreach ($s as $i => $row) {
    $txt = convertRegexChars($row['t']);
    $replaced = preg_replace('/(?='.$txt.'[^\]][^(])\b'.$txt.'\b/mS',
                           '<a href="'.$row["u"].'">'.$txt.'</a>',
                           $replaced);
 }
echo $replaced;

?>
answered 2010-06-19T12:22:56.980