php - Removing nested bbcode (quotes) in PHP

Question

I'm trying to remove nested quotes from my bulletin board, but I'm running into some problems.

Example input:

[quote author=personX link=topic=12.msg1910#msg1910 date=1282745641]
[quote author=PersonY link=topic=12.msg1795#msg1795 date=1282727068]

The message in the original quote

[/quote]
A second message quoting the first one

[/quote]

[quote author=PersonZ link=topic=1.msg1#msg1 date=1282533805]

A random third quote

[/quote]

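Example output:

[quote author=personX link=topic=12.msg1910#msg1910 date=1282745641]

Message in the second quote

[/quote]

[quote author=PersonZ link=topic=1.msg1#msg1 date=1282533805]

A random third quote

[/quote]

As you can see, the nested quote (the original message) is removed, along with its quote tags.

I can't seem to figure it out.

When I try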

$toRemove = '(\\[)(quote)(.*?)(\\])';
$string = $txt;
$found = 0; echo preg_replace("/($toRemove)/e", '$found++ ? \'\' : \'$1\'', $string);

it removes every quote tag except the first one,

but when I extend the code to:

$toRemove = '(\\[)(quote)(.*?)(\\])(.*?)(\\[\\/quote\\])';
$string = $txt;
$found = 0; echo preg_replace("/($toRemove)/e", '$found++ ? \'\' : \'$1\'', $string);

it stops doing anything at all.

Any ideas on this?

Edit:

Thanks for your help, Haggi.

I keep running into trouble, though.

The while loop

while ( $input = preg_replace_callback( '~\[quoute.*?\[/quote\]~i', 'replace_callback', $input ) ) {
// replace every occurence
}

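causes the page to loop indefinitely, and when the loop is removed (along with the extra u in quoute), the page does nothing.

I have determined that the cause is the matching.

When it is changed to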

$input = preg_replace_callback( '/\[quote(.*?)/i', 'replace_callback', $input );

the code does start working, but when it is changed to

$input = preg_replace_callback( '/\[quote(.*?)\[\/quote\]/i', 'replace_callback', $input );

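it stops doing anything again.

Also, there is a problem with the undo_replace function: it never finds the stored hash and only gives warnings about an undefined index. I guess the regex that matches the sha1 isn't working properly.

The full code I have now: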

$cache = array();
$input = $txt;

function replace_callback( $matches ) {
    global $cache;
    $hash = sha1( $matches[0] );
    $cache["hash"] = $matches[0];
    return "REPLACE:$hash";
}



// replace all quotes with placeholders
$input = preg_replace_callback( '/\[quote(.*?)\[quote\]/i', 'replace_callback', $input );

function undo_replace( $matches ) {
    global $cache;
    return $cache[$matches[1]];
}

// restore the outer most quotes
$input = preg_replace_callback( '~REPLACE:[a-f0-9]{40}~i', 'undo_replace', $input );

// remove the references to the inner quotes
$input = preg_replace( '~REPLACE:[a-f0-9]{40}~i', '', $input );

echo $input;

Thanks again for any ideas, guys :)

score 2 · Accepted Answer

It's easy to see why the first one is the only one left:

'$found++ ? \'\' : \'$1\''

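At the start, $found is undefined and evaluates to false, so $1 is returned. Then $found is incremented to 1 (undefined + 1 = 1), so it is greater than zero, and it is incremented further on every later call. Since anything other than zero evaluates to true, you always get '' back after that.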
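To make the counter behaviour concrete, here is a small sketch of the same keep-only-the-first-match logic. It uses preg_replace_callback with a closure instead of the /e modifier, since /e was deprecated in PHP 5.5 and removed in PHP 7.0. The sample string and variable names are made up for illustration.

```php
<?php
// Illustrative input: three opening tags in a row (not from the question).
$text  = '[quote a][quote b][quote c]';
$found = 0;

$result = preg_replace_callback(
    '/\[quote.*?\]/',
    function (array $m) use (&$found) {
        // $found++ returns the OLD value: 0 (falsy) on the first call,
        // then 1, 2, ... (truthy) on every later call.
        return $found++ ? '' : $m[0];
    },
    $text
);

echo $result, "\n"; // [quote a]
echo $found, "\n";  // 3
```

Only the first call sees a falsy counter, so only the first tag survives. This is exactly the behaviour described in the question.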

What you want to do is something like this:

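$cache = array();

function replace_callback( $matches ) {
    global $cache;
    // hash the matched quote so it can be looked up again later
    $hash = sha1( $matches[0] );
    $cache[$hash] = $matches[0];
    return "REPLACE:$hash";
}

// replace all quotes with placeholders, innermost first:
// the lookahead stops a match from running across another opening tag,
// and the s modifier lets . match the newlines inside a quote
$count = 0;
do {
    $input = preg_replace_callback( '~\[quote\b[^\]]*\](?:(?!\[quote\b).)*?\[/quote\]~is', 'replace_callback', $input, -1, $count );
    // repeat until no quote is left
} while ($count > 0);

function undo_replace( $matches ) {
    global $cache;
    // $matches[1] is the hash captured by the parentheses in the pattern below
    return $cache[$matches[1]];
}

// restore the outermost quotes (a single pass, so inner placeholders stay)
$input = preg_replace_callback( '~REPLACE:([a-f0-9]{40})~i', 'undo_replace', $input );

// remove the references to the inner quotes
$input = preg_replace( '~REPLACE:[a-f0-9]{40}~i', '', $input );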

This code is untested, because I don't have PHP at hand to test it. If there are any errors you can't fix, please post them here and I'll fix them.
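For comparison, here is a sketch of a different route that avoids placeholders altogether. It is not the method from this answer. It splits the text on quote tags and tracks the nesting depth, keeping only what sits at depth 0 or 1. The function name strip_nested_quotes is hypothetical, and the sketch assumes PHP 7+ and tags of the form [quote ...] and [/quote].

```php
<?php
// Hypothetical helper: keeps top-level quotes and drops every quote
// nested inside another quote, tags included.
function strip_nested_quotes(string $text): string
{
    // With PREG_SPLIT_DELIM_CAPTURE and one capture group, odd indexes
    // hold the tags and even indexes hold the text between them.
    $parts = preg_split(
        '~(\[quote\b[^\]]*\]|\[/quote\])~i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $depth = 0;
    $out   = '';
    foreach ($parts as $i => $part) {
        $isTag = ($i % 2 === 1);
        if ($isTag && $part[1] !== '/') {
            // Opening tag: only the outermost level is kept.
            $depth++;
            if ($depth === 1) {
                $out .= $part;
            }
        } elseif ($isTag) {
            // Closing tag: kept at the outermost level (stray closers too).
            if ($depth <= 1) {
                $out .= $part;
            }
            if ($depth > 0) {
                $depth--;
            }
        } elseif ($depth <= 1) {
            // Plain text outside quotes or directly inside a top-level quote.
            $out .= $part;
        }
    }
    return $out;
}

$sample = "[quote author=X]\n[quote author=Y]\norig\n[/quote]\nsecond\n[/quote]\n\n"
        . "[quote author=Z]\nthird\n[/quote]";
echo strip_nested_quotes($sample), "\n";
```

The depth counter makes a single pass over the text and needs no cache, at the cost of assuming the tags are reasonably well balanced.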

Cheers,
Haggi

score 0 · Accepted Answer

I searched for several solutions for nested quotes using preg_replace, but none of them worked. So I tried my own small version, based on my requirements.

$position = strrpos($string, '[/quote:');  // this will get the position of last quote
$text = substr(strip_tags($string),$position+17); // this will get the data after the last quote used.

Hope this helps someone.
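As a rough sketch of the same idea for plain [/quote] closing tags (an assumption; the snippet above appears to expect a closing tag with a suffix, which is where its fixed offset of 17 comes from), the offset can be derived from the tag length instead of being hard-coded. The function name text_after_last_quote is made up for illustration. Like the snippet above, it discards every quote and keeps only the text that follows the last one.

```php
<?php
// Hypothetical helper: returns the text after the last closing quote tag,
// or the whole (trimmed) text when there is no quote at all.
function text_after_last_quote(string $text): string
{
    $tag = '[/quote]';
    // strripos() finds the last occurrence, ignoring case.
    $pos = strripos($text, $tag);
    if ($pos === false) {
        return trim($text);
    }
    return trim(substr($text, $pos + strlen($tag)));
}

echo text_after_last_quote("[quote]first[/quote]\nmy reply"), "\n"; // my reply
```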
