
Can any regex ninja come up with a PHP solution that strips the tags out of any http/URL, but leaves the tags in the rest of the text alone?

For example:

the word <cite>printing</cite> is in http://www.thisis<cite>printing</cite>.com

should become:

the word <cite>printing</cite> is in http://www.thisisprinting.com

3 Answers


A suitable regex for this replacement could be:

#(https?://)(.*?)<cite>(.*?)</cite>([^\s]*)#s
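  1. The s flag lets . match across newlines as well.

  2. The lazy quantifiers (.*?) keep the match around the tags as tight as possible, so it does not run on and swallow other, similar tags further along.

Snippet: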

<?php
$str = "the word <cite>printing</cite> is in http://www.thisis<cite>printing</cite>.com";
// Single quotes keep $1..$4 as literal backreferences for preg_replace
$replaced = preg_replace('#(https?://)(.*?)<cite>(.*?)</cite>([^\s]*)#s', '$1$2$3$4', $str);
echo $replaced;

// Output: the word <cite>printing</cite> is in http://www.thisisprinting.com
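
To illustrate point 2, here is a small sketch that runs the lazy pattern next to a greedy variant. The greedy pattern and the sample sentence are my own assumptions, added only for comparison; they are not part of the snippet above.

```php
<?php
// Sample sentence made up for this comparison: one tagged URL, then a tagged word.
$str = 'see http://www.thisis<cite>printing</cite>.com and <cite>more</cite> text';

// Lazy quantifiers: stop at the first <cite> ... </cite> pair after the scheme.
$lazyPattern = '#(https?://)(.*?)<cite>(.*?)</cite>([^\s]*)#s';
// Greedy variant (hypothetical): runs on to the last <cite> pair in the string.
$greedyPattern = '#(https?://)(.*)<cite>(.*)</cite>([^\s]*)#s';

$lazy = preg_replace($lazyPattern, '$1$2$3$4', $str);
$greedy = preg_replace($greedyPattern, '$1$2$3$4', $str);

echo $lazy, PHP_EOL;
// see http://www.thisisprinting.com and <cite>more</cite> text
echo $greedy, PHP_EOL;
// see http://www.thisis<cite>printing</cite>.com and more text
```

The greedy version strips the wrong pair. Keep in mind that even the lazy pattern removes only one <cite> pair per URL in a single pass, so a URL containing several tagged fragments would need repeated passes or a different approach.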


Answered on 2013-10-24T22:02:35.653

Here is what I would do:

<?php
//a callback function wrapper for strip_tags
function strip($matches){
    return strip_tags($matches[0]);
}

//the string
$str = "the word <cite>printing</cite> is in http://www.thisis<cite>printing</cite>.com";
//match a url and call the strip callback on it
$str = preg_replace_callback("/:\/\/[^\s]*/", 'strip', $str);

//prove that it works
var_dump(htmlentities($str));

http://codepad.viper-7.com/XiPcs9
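
A slightly stricter variant of the same idea, as a rough sketch: it only treats http:// and https:// as the start of a URL and uses an anonymous callback. The function name stripTagsInUrls and the extra <b> tag in the sample are assumptions for illustration. It also assumes the tags inside a URL contain no whitespace (a tag with attributes would end the match early).

```php
<?php
// Hypothetical helper: strips every tag found inside http(s) URLs, leaves other text alone.
function stripTagsInUrls($text)
{
    return preg_replace_callback(
        // The scheme followed by any run of non-whitespace characters.
        '#https?://\S*#i',
        function (array $matches) {
            // $matches[0] is the whole URL candidate, tags included.
            return strip_tags($matches[0]);
        },
        $text
    );
}

$str = 'the word <cite>printing</cite> is in http://www.<b>this</b>is<cite>printing</cite>.com';
echo stripTagsInUrls($str), PHP_EOL;
// the word <cite>printing</cite> is in http://www.thisisprinting.com
```

Because strip_tags runs on the whole matched URL, any number of tags inside it is removed in one pass.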

Answered on 2013-10-24T21:56:18.737

Assuming you can already pick the URLs out of the text, you can do:

$str = 'http://www.thisis<cite>printing</cite>.com';
$str = preg_replace('~</?cite>~i', "", $str);
echo $str;

Output:

http://www.thisisprinting.com
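
A quick sketch of what this pattern does and does not cover. The sample URLs and the looser pattern (which also accepts attributes on the tag) are my own assumptions for illustration.

```php
<?php
// Pattern from the answer: matches <cite> and </cite>, case-insensitively.
$strict = '~</?cite>~i';
// Hypothetical looser variant: also matches tags that carry attributes.
$loose = '~</?cite\b[^>]*>~i';

// Made-up sample URLs.
$urls = [
    'http://www.thisis<cite>printing</cite>.com',
    'http://www.<CITE>this</CITE>is<cite>printing</cite>.com',
    'http://www.thisis<cite class="hit">printing</cite>.com',
];

foreach ($urls as $url) {
    echo preg_replace($strict, '', $url), ' | ', preg_replace($loose, '', $url), PHP_EOL;
}
```

The first two URLs come out clean with either pattern. For the third, the strict pattern leaves the opening tag with the attribute in place, while the looser one removes it.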
Answered on 2013-10-24T21:48:46.627