php - Remove HTML tags

Question

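Currently I use strip_tags to remove all HTML tags from the strings I process. However, I recently noticed that it joins together words that were contained in the removed tags, i.e.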

$str = "<li>Hello</li><li>world</li>";
$result = strip_tags($str);
echo $result;
(prints Helloworld)

How can I solve this?

score 2 · Accepted Answer

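This replaces every HTML tag with a space (really anything of the form <ABC>; it does not check whether it is actually HTML), then replaces any resulting double spaces with a single space and trims leading and trailing spaces.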

$str = preg_replace("/<.*?>/", " ", $str);
$str = trim(str_replace("  ", " ", $str));
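
The same idea can be wrapped in a function, as a minimal sketch. The name `replace_tags_with_space` is hypothetical, the type hints assume PHP 7 or later, and the whitespace step is swapped for `preg_replace('/\s+/', ' ', ...)`, because a single `str_replace("  ", " ", ...)` pass leaves two spaces behind wherever three or more occur in a row (for example after several adjacent tags).

```php
<?php
// Hypothetical helper, not part of the original answer.
function replace_tags_with_space(string $html): string
{
    // Replace anything shaped like <...> with a space (no HTML validation).
    $text = preg_replace('/<.*?>/', ' ', $html);
    // Collapse every run of whitespace into one space, then trim the ends.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo replace_tags_with_space("<li>Hello</li><li>world</li>"), "\n"; // Hello world
```

Note that `.` does not match a line break by default, so a tag that spans several lines would be left in place by this pattern.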

score 2 · Accepted Answer

You can experiment to see which regular expression pattern works best and what to replace the tags with :)

// ------------------------------------

function strip_html_tags($string) {

    // Flatten line breaks and tabs into spaces.
    $string = str_replace("\r", ' ', $string);
    $string = str_replace("\n", ' ', $string);
    $string = str_replace("\t", ' ', $string);
    // Optional: $string = str_replace('<li>', "\n* ", $string);

    // Alternative pattern: "/<.*?>/"
    $pattern = '/<[^>]*>/';

    // Replace each tag with a space.
    $string = preg_replace($pattern, ' ', $string);

    // Collapse repeated spaces and trim the ends.
    $string = trim(preg_replace('/ {2,}/', ' ', $string));

    return $string;

}

// ------------------------------------

You can also add special replacements, for example '<li>' to "\n* ", or others :)
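
As a rough sketch of that special replacement, the variant below turns each `<li>` into a bullet on its own line before the remaining tags are removed. The function name `strip_html_tags_with_bullets` and the extra clean-up of spaces before line breaks are assumptions added for illustration, and only a bare `<li>` (without attributes) is recognised.

```php
<?php
// Hypothetical variant of strip_html_tags(); name and bullet format are assumptions.
function strip_html_tags_with_bullets(string $string): string
{
    // Flatten existing line breaks and tabs so only list items start new lines.
    $string = str_replace(["\r", "\n", "\t"], ' ', $string);
    // Mark each list item before the tags are removed (case-insensitive match).
    $string = str_ireplace('<li>', "\n* ", $string);
    // Replace every remaining tag with a space.
    $string = preg_replace('/<[^>]*>/', ' ', $string);
    // Collapse repeated spaces and drop spaces left just before a line break.
    $string = preg_replace('/ {2,}/', ' ', $string);
    $string = preg_replace('/ +\n/', "\n", $string);
    return trim($string);
}

echo strip_html_tags_with_bullets("<ul><li>Hello</li><li>world</li></ul>"), "\n";
// * Hello
// * world
```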

score 1 · Accepted Answer

From your code I can see that there was no space between the words Hello and world to begin with, and you can't expect the strip_tags function to add one for you. To make strip_tags produce exactly what you want, I added a space after the first closing list tag, and the result was Hello world.

You can copy and paste this code and run it to see the difference.

    $str = "<li>Hello</li> <li>world</li>";
    $result = strip_tags($str);
    echo $result;
    // Expected output: Hello world

score 1 · Accepted Answer


It would be better to use htmlentities().

It does not remove < and >; it escapes them instead.
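
For illustration, here is what that looks like on the string from the question. This is only a sketch of the escaping behaviour: the tags stay in the output as visible text, so whether it fits depends on whether you want the markup shown or removed.

```php
<?php
$str = "<li>Hello</li><li>world</li>";
// htmlentities() converts characters such as < and > into HTML entities.
echo htmlentities($str), "\n";
// &lt;li&gt;Hello&lt;/li&gt;&lt;li&gt;world&lt;/li&gt;
```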

Answered 2011-12-11T17:02:36.277

score 1 · Accepted Answer

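It all depends on the output you want after stripping the HTML tags. For example:

If you want the <li> tags converted into a simple list of items, I suggest using str_replace to replace <li> with * and </li> with \n.

strip_tags is only meant to remove HTML tags, without doing any other transformation.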
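
A minimal sketch of that suggestion, using the string from the question. The space after `*` and the final `strip_tags` and `trim` calls are assumptions added here to tidy the result; they are not part of the suggestion itself.

```php
<?php
$str = "<li>Hello</li><li>world</li>";
// Turn each <li> into a bullet and each </li> into a line break.
$list = str_replace(['<li>', '</li>'], ['* ', "\n"], $str);
// Remove any tags that are left, then trim the trailing line break.
$result = trim(strip_tags($list));
echo $result, "\n";
// * Hello
// * world
```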

score 1 · Accepted Answer

echo strip_tags( str_replace( '>', '> ', $string ));

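This should do what you want, although the output can contain leading, trailing, or doubled spaces, because a space is added after every tag, opening or closing.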
