php - Strip HTML tags from a string

Question

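I have a website that parses RSS feeds from other sites and publishes them on a page.
The script running behind my site, which reads and reformats the RSS feeds, currently strips out all HTML tags.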

Here is the code:
$description = strip_tags($description);

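I would like to allow the tags <p>, <a> or <br />, but when I do, my site gets messed up for some reason. For example, a large gap appears above the titles.
What is the solution?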

=== EDIT === (more code)

// get all of the sources of news from the database
$get_sources = $db->query("SELECT * FROM ".$prefix."sources ORDER BY last_crawled ASC");
while ($source = $db->fetch_array($get_sources)) {

$feed = new SimplePie($source['url']);

$feed->handle_content_type();  

foreach ($feed->get_items() as $item)  
{  

    $title = $item->get_title();  
    $link = $item->get_link();
    $description = $item->get_content();

    // strip all html
    $description = strip_tags($description);

    // format the data to make sure it's all fine
    $title = html_entity_decode($title, ENT_QUOTES, 'UTF-8');

    // create the path, or slug if you will
    $path = post_slug($title);

    $description = html_entity_decode($description, ENT_QUOTES, 'UTF-8');

score 3 · Accepted Answer

Before stripping the tags, run a string replacement that escapes the tags you want to keep.

$source = str_replace('<p>', '&lt;p&gt;', $source);
$source = strip_tags($source);

Then use htmlspecialchars_decode(trim($source)) when you output to HTML.
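
The steps above could be wrapped up roughly as follows. This is only a sketch: the helper name strip_tags_keep_simple and the sample $description are made up for illustration and are not part of the original script. It escapes a fixed list of attribute-free tags, strips everything else, then decodes the escaped tags again.

```php
<?php
// Hypothetical helper (not from the original post): keeps a fixed list of
// attribute-free tags by escaping them before strip_tags() runs.
function strip_tags_keep_simple(string $html, array $keep = ['<p>', '</p>', '<br />', '<br>']): string
{
    // '<p>' becomes '&lt;p&gt;', which strip_tags() no longer treats as a tag.
    $escaped = array_map('htmlspecialchars', $keep);
    $html = str_replace($keep, $escaped, $html);
    $html = strip_tags($html);
    // Turn the escaped tags back into real markup for output.
    return htmlspecialchars_decode(trim($html));
}

// Sample input, assumed for illustration only.
$description = '<div><p>Hello <b>world</b></p><br /></div>';
echo strip_tags_keep_simple($description), "\n";

// Built-in alternative: the second argument of strip_tags() lists allowed tags.
echo strip_tags($description, '<p><a><br>'), "\n";
```

Two things to keep in mind with this approach: str_replace only matches the exact strings given, so a tag with attributes such as <a href="..."> is not covered (the allowed-tags argument of strip_tags handles that case), and htmlspecialchars_decode will also decode any entities that were already present in the feed text.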

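I would bet that the reason your page layout breaks is CSS-related. Look closely at the source you generate (use Firebug if possible) and make sure every HTML element has a matching closing tag, and that your script is not altering any of the HTML elements you intended to keep, although I don't know why it would.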
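
One quick way to check for missing closing tags, besides reading the source by eye, is to count opening and closing tags in the generated description. A minimal sketch, assuming a hypothetical helper named count_unclosed; it is a rough regex count, not a real HTML parser:

```php
<?php
// Hypothetical helper: returns (opening tags - closing tags) for one tag name.
// A non-zero result suggests the markup for that tag is unbalanced.
function count_unclosed(string $html, string $tag): int
{
    $name = preg_quote($tag, '/');
    // Matches <p> and <p class="x">, but not <pre>.
    $open = preg_match_all('/<' . $name . '(\s[^>]*)?>/i', $html);
    $close = preg_match_all('/<\/' . $name . '\s*>/i', $html);
    return $open - $close;
}

// Sample input, assumed for illustration only.
$description = '<p>First paragraph</p><p>Second paragraph <a href="#">link</a>';
foreach (['p', 'a'] as $tag) {
    echo $tag, ': ', count_unclosed($description, $tag), "\n";
}
```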

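Try isolating your script's output on a blank page so you can look closely at what is happening. Then, once you are sure everything is fine, if the problem persists, try placing the output in a different part of the page. Also, make sure you trim whitespace.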
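
If the gap above the title turns out to come from leading line breaks or empty paragraphs in the feed content (an assumption, the question does not confirm it), trimming could look something like this. The helper name trim_leading_breaks is hypothetical:

```php
<?php
// Hypothetical helper: trims surrounding whitespace, then removes any <br>
// tags and empty <p></p> pairs that sit at the very start of the string.
function trim_leading_breaks(string $html): string
{
    $html = trim($html);
    return preg_replace('/^(?:<br\s*\/?>\s*|<p>\s*<\/p>\s*)+/i', '', $html);
}

// Sample input, assumed for illustration only.
$description = "  <br />\n<br><p> </p><p>Title</p> ";
echo trim_leading_breaks($description), "\n";
```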

Let us know what you find.
