php - PHP DOMDocument replace DOMElement child with HTML string

Question

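Using PHP, I am attempting to take an HTML string passed from a WYSIWYG editor and replace the children of an element inside a preloaded HTML document with the new HTML.

So far I am loading the document and identifying the element I want to change by ID, but the process of converting an HTML string into something that can be placed inside a DOMElement is eluding me.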

libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$element = $doc->getElementById($item_id);
if(isset($element)){
    //Remove the old children from the element
    while($element->childNodes->length){
        $element->removeChild($element->firstChild);
    }

    //Need to build the new children from $html_string and append to $element
}

score 14 · Accepted Answer

If the HTML string can be parsed as XML, you can do this (after clearing the element of all child nodes):

$fragment = $doc->createDocumentFragment();
$fragment->appendXML($html_string);
$element->appendChild($fragment);

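If $html_string cannot be parsed as XML, this will fail. In that case you will have to use loadHTML(), which is less strict, but it adds wrapper elements around the fragment that you will have to strip.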
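To make the strictness of appendXML() concrete, here is a minimal sketch. The helper name appendXmlString() and the sample markup are made up for illustration; the sketch only assumes that appendXML() returns false when the string is not well-formed XML.

```php
<?php
// Hypothetical helper: append an XML-parseable string to an element.
// Returns false (and appends nothing) when the string is not well-formed XML.
function appendXmlString(DOMElement $element, string $xml): bool
{
    if ($xml === '') {
        return false; // an empty fragment cannot be appended
    }
    $fragment = $element->ownerDocument->createDocumentFragment();
    // @ silences the parser warning; the return value tells us what happened.
    if (!@$fragment->appendXML($xml)) {
        return false;
    }
    $element->appendChild($fragment);
    return true;
}

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div id="item"><p>old</p></div></body></html>');
$element = $doc->getElementById('item');

// Remove the old children from the element.
while ($element->firstChild) {
    $element->removeChild($element->firstChild);
}

// Well-formed markup is accepted.
$ok = appendXmlString($element, '<p>Hello <b>world</b></p>');

// Typical editor output with an unclosed <br> is valid HTML but not XML.
$bad = appendXmlString($element, '<p>line one<br>line two</p>');

var_dump($ok, $bad);
```

The second call is the case where falling back to loadHTML() becomes necessary.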

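Unlike PHP, JavaScript has the innerHTML property, which lets you do this very easily. I needed something like it for a project, so I extended PHP's DOMElement to include JavaScript-like innerHTML access.

With it you can read and change the innerHTML property just as you would in JavaScript: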

echo $element->innerHTML;
$element->innerHTML = '<a href="http://example.org">example</a>';

Source: http://www.keyvan.net/2012/11/php-domdocument-replace-domelement-child-with-html-string/
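The extension itself is not reproduced in this answer, so the following is only a rough sketch of how innerHTML-style access could be added, not the linked author's actual class. The class name InnerHtmlElement is hypothetical, and the setter relies on appendXML(), so it shares the "must be well-formed XML" limitation described above.

```php
<?php
// Hypothetical DOMElement subclass exposing a JavaScript-like innerHTML.
class InnerHtmlElement extends DOMElement
{
    public function __get($name)
    {
        if ($name !== 'innerHTML') {
            return null;
        }
        // Serialize each child node and concatenate the results.
        $html = '';
        foreach ($this->childNodes as $child) {
            $html .= $this->ownerDocument->saveHTML($child);
        }
        return $html;
    }

    public function __set($name, $value)
    {
        if ($name !== 'innerHTML') {
            return;
        }
        // Drop the existing children first.
        while ($this->firstChild) {
            $this->removeChild($this->firstChild);
        }
        if ($value === '') {
            return;
        }
        // Assumption: $value is well-formed XML. If it is not, the element stays empty.
        $fragment = $this->ownerDocument->createDocumentFragment();
        if (@$fragment->appendXML($value)) {
            $this->appendChild($fragment);
        }
    }
}

libxml_use_internal_errors(true);
$doc = new DOMDocument();
// Ask the document to hand out InnerHtmlElement objects instead of DOMElement.
$doc->registerNodeClass('DOMElement', 'InnerHtmlElement');
$doc->loadHTML('<html><body><div id="item"><p>old</p></div></body></html>');

$element = $doc->getElementById('item');
$element->innerHTML = '<a href="http://example.org">example</a>';
echo $element->innerHTML, "\n";
```

registerNodeClass() is what makes getElementById() return the subclass, so the magic accessors are available on every element of that document.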

score 3 · Accepted Answer

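The currently accepted answer suggests using appendXML(), but acknowledges that it will not handle complex HTML such as what the WYSIWYG editor in the original question returns. As suggested, loadHTML() can solve this, but nobody has shown how yet.

This is what I believe is the best/correct answer to the original question. It addresses the encoding issues, the "Document Fragment is empty" warning and the "Wrong Document Error" errors that people are likely to hit if they write this from scratch. I know I ran into them while following the hints in the previous replies.

This is code from a site I support; it inserts WordPress sidebar content into the $content of a post. It assumes $doc is a valid DOMDocument, similar to how $doc is defined in the original question. It also assumes $element is the tag after which you want to insert the sidebar content (or whatever else).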

            // NOTE: Cannot use a document fragment here as the AMP html is too complex for the appendXML function to accept.
            // Instead create it as a document element and insert that way.
            $node = new DOMDocument();
            // Note that we must encode it correctly or strange characters may appear.
            // (The 'HTML-ENTITIES' target is deprecated as of PHP 8.2.)
            $node->loadHTML( mb_convert_encoding( $sidebarContent, 'HTML-ENTITIES', 'UTF-8') );
            // Now we need to move this document element into the scope of the content document 
            // created above or the insert/append will be rejected.
            $node = $doc->importNode( $node->documentElement, true );
            // If there is a next sibling, insert before it.
            // If not, just add it at the end of the element we did find.
            if (  $element->nextSibling ) {
                $element->parentNode->insertBefore( $node, $element->nextSibling );
            } else {
                $element->parentNode->appendChild($node);
            }
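On PHP 8.2 and later, the 'HTML-ENTITIES' conversion used above triggers a deprecation notice. One possible substitute, sketched here under the assumption that the input is UTF-8, is to turn every non-ASCII character into a numeric entity with mb_encode_numericentity() before calling loadHTML(). The sample string is made up for illustration.

```php
<?php
// Illustrative input; in the answer above this would be the sidebar markup.
$sidebarContent = '<p>café ☕</p>';

// Convert every code point from U+0080 upwards into a numeric entity,
// leaving plain ASCII (including the tags themselves) untouched.
$encoded = mb_encode_numericentity(
    $sidebarContent,
    [0x80, 0x10FFFF, 0, 0x1FFFFF],
    'UTF-8'
);

libxml_use_internal_errors(true);
$node = new DOMDocument();
$node->loadHTML($encoded);

// The parser decodes the entities again, so the text survives intact.
$paragraph = $node->getElementsByTagName('p')->item(0);
echo $paragraph->textContent, "\n";
```

The rest of the import and insert logic stays the same; only the encoding step changes.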

Once all of that is done, if you do not want the source of a full HTML document with a body tag and so on, you can generate more localized HTML with the following:

    // Now because we have moved the post content into a full document, we need to get rid of the 
    // extra elements that make it a document and not a fragment
    $body = $doc->getElementsByTagName( 'body' );
    $body = $body->item(0);

    // If you need an element with a body tag, you can do this.
    // return $doc->savehtml( $body );

    // Extract the html from the body tag piece by piece to ensure valid html syntax in destination document
    $bodyContent = ''; 
    foreach( $body->childNodes as $node ) { 
            $bodyContent .= $body->ownerDocument->saveHTML( $node ); 
    } 
    // Now return the full content with the new content added. 
    return $bodyContent;

score 1 · Accepted Answer

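I know this is an old thread, but I am replying because I was also looking for a solution. I wrote a simple method that replaces the content in a single line at the call site. To make the method easier to understand, I have also added some contextually named wrapper functions.

This is now part of my library, which explains the function names here: every function starts with the prefix "su".

It is very easy to use and very powerful (and it takes very little code).

Here is the code: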

function suSetHtmlElementById( &$oDoc, &$s, $sId, $sHtml, $bAppend = false, $bInsert = false, $bAddToOuter = false )
 {
    if( suIsValidString( $s ) && suIsValidString( $sId ))
    {
     $bCreate = true;
     if( is_object( $oDoc ))
     {
       if( !( $oDoc instanceof DOMDocument ))
        { return false; }
       $bCreate = false;
     }

     if( $bCreate )
      { $oDoc = new DOMDocument(); }

     libxml_use_internal_errors(true);
     $oDoc->loadHTML($s);
     libxml_use_internal_errors(false);
     $oNode = $oDoc->getElementById( $sId );

     if( is_object( $oNode ))
     { 
       $bReplaceOuter = ( !$bAppend && !$bInsert );

       $sId = uniqid('SHEBI-');
       $aId = array( "<!-- $sId -->", "<!--$sId-->" );

       if( $bReplaceOuter )
       {
         if( suIsValidString( $sHtml ) )
         {
             $oNode->parentNode->replaceChild( $oDoc->createComment( $sId ), $oNode );
             $s = str_replace( $aId, $sHtml, $oDoc->saveHtml());
         }
         else { $oNode->parentNode->removeChild( $oNode ); 
                $s = $oDoc->saveHtml();
              }
         return true;
       }

       $bReplaceInner = ( $bAppend && $bInsert );
       $sThis = null;

       if( !$bReplaceInner )
       {
         $sThis = $oDoc->saveHTML( $oNode );
         $sThis = ($bInsert?$sHtml:'').($bAddToOuter?$sThis:(substr($sThis,strpos($sThis,'>')+1,-(strlen($oNode->nodeName)+3)))).($bAppend?$sHtml:''); 
       }

       if( !$bReplaceInner && $bAddToOuter )
       { 
          $oNode->parentNode->replaceChild( $oDoc->createComment( $sId ), $oNode );
          $sId = &$aId;
       }
       else { $oNode->nodeValue = $sId; }

       $s = str_replace( $sId, $bReplaceInner?$sHtml:$sThis, $oDoc->saveHtml());
       return true;
     }
    } 
    return false; 
 }

// A function of my library used in the function above:
function suIsValidString( &$s, &$iLen = null, $minLen = null, $maxLen = null )
{
  // Curly-brace string offsets ($s{0}) were removed in PHP 8; use square brackets.
  if( !is_string( $s ) || !isset( $s[0] ))
   { return false; }

  if( $iLen !== null )
   { $iLen = strlen( $s ); }

  // Each length check is parenthesized so the ternaries do not swallow the &&.
  return (( $minLen===null?true:($minLen > 0 && isset( $s[$minLen-1] ))) && 
          ( $maxLen===null?true:($maxLen >= $minLen && !isset( $s[$maxLen] ))));   
}

Some contextual wrapper functions:

 function suAppendHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, false ); }

 function suInsertHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, true ); }

 function suAddHtmlBeforeById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, true, true ); }

 function suAddHtmlAfterById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, false, true ); }

 function suSetHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, true ); }

 function suReplaceHtmlElementById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, false ); }

 function suRemoveHtmlElementById( &$s, $sId, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, null, false, false ); }

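How to use it:

In the following examples I assume the content has already been loaded into a variable named $sMyHtml, and that the variable $sMyNewContent contains some new HTML. $sMyHtml contains an element whose name/id is "example_id".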

// Example 1: Append new content to the innerHTML of an element (bottom of element):
if( suAppendHtmlById( $sMyHtml, 'example_id', $sMyNewContent ))
 { echo $sMyHtml; }
 else { echo 'Element not found?'; }

// Example 2: Insert new content to the innerHTML of an element (top of element):
suInsertHtmlById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 3: Add new content ABOVE element:
suAddHtmlBeforeById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 4: Add new content BELOW/NEXT TO element:
suAddHtmlAfterById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 5: SET new innerHTML content of element:
suSetHtmlById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 6: Replace entire element with new content:
suReplaceHtmlElementById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 7: Remove entire element:
suRemoveHtmlElementById( $sMyHtml, 'example_id' );

score 1 · Accepted Answer

You can use loadHTML() on the code fragment and then append the resulting nodes to the original DOM tree.
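A minimal sketch of that idea follows. The function name replaceChildrenWithHtml() and the sample markup are hypothetical. The sketch parses the fragment in a scratch document, then imports only the children of its body so that the html and body wrappers never reach the target. It assumes the fragment is ASCII or has already been entity-encoded.

```php
<?php
// Hypothetical helper: replace all children of $target with parsed HTML.
function replaceChildrenWithHtml(DOMElement $target, string $html): void
{
    // Parse the fragment leniently in a separate scratch document.
    $previous = libxml_use_internal_errors(true);
    $scratch = new DOMDocument();
    $scratch->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Remove the old children from the target element.
    while ($target->firstChild) {
        $target->removeChild($target->firstChild);
    }

    $body = $scratch->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return;
    }

    // importNode() copies each node into the target's document, which
    // avoids the "Wrong Document Error" raised by a direct appendChild().
    foreach ($body->childNodes as $child) {
        $target->appendChild($target->ownerDocument->importNode($child, true));
    }
}

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div id="item"><p>old</p></div></body></html>');
$element = $doc->getElementById('item');

// The unclosed <br> would be rejected by appendXML(), but loadHTML() accepts it.
replaceChildrenWithHtml($element, '<p>first<br>line</p><ul><li>one</li></ul>');

echo $doc->saveHTML($element), "\n";
```

Because importNode() makes copies, the scratch document can simply be discarded afterwards.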

score 1 · Accepted Answer

I know this is old, but none of the current answers shows a minimal working example of how to replace DOMNode(s) in a DOMDocument with HTML stored in a string.

// the HTML fragment we want to use as the replacement
$htmlReplace = '<div><strong>foo</strong></div>';
// the HTML of the original document
$htmlHaystack = '<p><a id="tag">bar</a></p>';

// load the HTML replacement fragment
$domDocumentReplace = new \DOMDocument;
$domDocumentReplace->loadHTML($htmlReplace, LIBXML_HTML_NOIMPLIED);

// load the HTML of the document
$domDocumentHaystack = new \DOMDocument;
$domDocumentHaystack->loadHTML($htmlHaystack, LIBXML_HTML_NOIMPLIED);

// import the replacement node into the document
$htmlReplaceNode = $domDocumentHaystack->importNode($domDocumentReplace->documentElement, true);

// find the DOMNode(s) we want to replace - in this case #tag (to keep the example simple)
$domNodeTag = $domDocumentHaystack->getElementById('tag');

// replace the node
$domNodeTag->parentNode->replaceChild($htmlReplaceNode, $domNodeTag);

// output the new HTML of the document
echo $domDocumentHaystack->saveHTML($domDocumentHaystack->documentElement);
// <p><div><strong>foo</strong></div></p>
