I have two questions:

  1. I am using the TinyMCE editor and want to remove empty tags from its HTML. I get the error "DOMDocument::loadHTML(): Unexpected end tag : p" when I pass text submitted from the TinyMCE editor, yet the error disappears when I pass the same text in directly. Strange! Please see the code below.

  2. How do I prevent warnings from DOMDocument when incorrect HTML is passed, for example <strong>Bold Item </b></strong>?

Here is some example text:

<p style="text-align: justify;"> </p>
<p>blah blah blah <strong></strong>.</p>
<p style="text-align: justify;"> </p>
<p>paragraph three!!.</p>

My function:

function remove_empty_tags ($text) {
    $dom = new DOMDocument;
    $dom->loadHTML($text);

    // fetch all the wanted nodes
    $xp = new DOMXPath($dom);
    foreach($xp->query('//*[not(node() or self::br) or normalize-space() = ""]') as $node) {
        $node->parentNode->removeChild($node);
    }

    // output the cleaned markup
    return $dom->saveXml($dom->getElementsByTagName('body')->item(0) );
}

echo remove_empty_tags($_POST['mce_editor']);

1 Answer

Use the following function, which calls error_reporting(0):

function remove_empty_tags ($text) {
    error_reporting(0); // added: hides the DOMDocument warnings (and every other PHP error)
    $dom = new DOMDocument;
    $dom->loadHTML($text);
    $xp = new DOMXPath($dom);
    foreach($xp->query('//*[not(node() or self::br) or normalize-space() = ""]') as $node) {
        $node->parentNode->removeChild($node);
    }
    return $dom->saveXml($dom->getElementsByTagName('body')->item(0) );
}
echo remove_empty_tags("<p>blah blah blah <strong><i></strong>.<em><span></em></span></p>");

I get the following result:

<p>blah blah blah .</p>

You can try this, though I am not sure whether it will work with your TinyMCE input.
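
Since error_reporting(0) silences every PHP error in the script, a narrower option may be libxml_use_internal_errors(true), which makes libxml keep its parse warnings internally instead of emitting them. Below is a minimal sketch of the same cleanup under that assumption. The name remove_empty_tags_quiet is hypothetical, and the XPath expression is copied unchanged from the question.

```php
<?php
// Sketch: same cleanup as remove_empty_tags(), but only libxml's parse
// warnings are suppressed; other PHP errors are still reported.
function remove_empty_tags_quiet(string $text): string
{
    // Collect libxml errors internally; the call returns the previous setting.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($text);

    // Drop the collected parse errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same selection as in the question: elements with no child nodes
    // or with only whitespace as text content.
    $xp = new DOMXPath($dom);
    foreach ($xp->query('//*[not(node() or self::br) or normalize-space() = ""]') as $node) {
        $node->parentNode->removeChild($node);
    }

    // Output the cleaned markup (still wrapped in <body>, as in the original).
    return $dom->saveXML($dom->getElementsByTagName('body')->item(0));
}

// Usage with the malformed snippet from the question; no warning is printed.
echo remove_empty_tags_quiet('<strong>Bold Item </b></strong>');
```

If you want to see what libxml complained about, libxml_get_errors() can be called before libxml_clear_errors() to inspect the collected messages.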

Update: there is another way to repair badly nested tags, using simplexml_import_dom:

error_reporting(0);
$text = "<p>blah blah blah <strong><i></strong>.<em><span></em></span></p>";
$dom = new DOMDocument();
$dom->loadHTML($text);
$repaired = simplexml_import_dom($dom)->asXML();
echo $repaired;

This gives me the following result:

<p>blah blah blah <strong><i></i></strong><i>.<em><span></span></em></i></p>
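
If you only want the repaired fragment back, a variation that may help is to let DOMDocument fix the nesting and then serialize just the children of <body> with saveHTML(). This is a rough sketch rather than a tested drop-in: the function name repair_html_fragment is hypothetical, and it uses libxml_use_internal_errors() in place of error_reporting(0).

```php
<?php
// Sketch: let DOMDocument repair the nesting, then serialize only the
// children of <body> so no <html>/<body> wrapper is returned.
function repair_html_fragment(string $text): string
{
    // Keep libxml's parse warnings internal while loading the markup.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($text);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // nothing parseable was found
    }

    // Concatenate the HTML of each top-level node inside <body>.
    $html = '';
    foreach ($body->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }
    return $html;
}

echo repair_html_fragment('<p>blah blah blah <strong><i></strong>.<em><span></em></span></p>');
```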
Answered 2012-11-16T03:56:25.850