When using PHP's DOMDocument with preserveWhiteSpace set to false and formatOutput set to true, whitespace in mixed content is not preserved consistently, even within a single element.

Source XML:
<p><span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>

Expected output:
<p><span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>

Actual output (space lost between "one" and "two"):
<p><span>one</span><span>two</span> text <span>three</span> <span>four</span></p>

A second example shows that the whitespace is preserved in some cases:

$examples = array(
    '<p>text <span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>',
    '<p><span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>',
);

foreach ($examples as $example) {
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = false;
    $doc->loadXML($example);
    $doc->formatOutput = true;

    print $doc->saveXML();
}

// <p>text <span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>
// <p><span>one</span><span>two</span> text <span>three</span> <span>four</span></p>

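My guess is that the heuristic libxml uses to detect mixed content does not look ahead within the element, so it only starts preserving whitespace-only text nodes once it has found a text node containing actual text.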
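One way to check this guess is to list the child nodes of the p element after parsing, once with preserveWhiteSpace enabled and once with it disabled. The following is only a minimal sketch: describeChildren is a hypothetical helper name introduced for illustration, and which whitespace-only nodes disappear when the option is off may depend on the libxml version in use.

```php
<?php

// Hypothetical helper (not part of DOMDocument): parse $xml and return one
// description string per direct child of the document element.
function describeChildren(string $xml, bool $preserve): array
{
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = $preserve;
    $doc->loadXML($xml);

    $out = array();
    foreach ($doc->documentElement->childNodes as $node) {
        if ($node instanceof DOMText) {
            // json_encode makes whitespace-only values visible, e.g. " "
            $out[] = 'text:' . json_encode($node->nodeValue);
        } else {
            $out[] = 'element:' . $node->nodeName;
        }
    }

    return $out;
}

$xml = '<p><span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>';

foreach (array(true, false) as $preserve) {
    print 'preserveWhiteSpace = ' . var_export($preserve, true) . "\n";
    foreach (describeChildren($xml, $preserve) as $line) {
        print '  ' . $line . "\n";
    }
}
```

Comparing the two listings shows exactly which whitespace-only text nodes were dropped by the parser, before formatOutput plays any role in serialization.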

Is this a) a bug in libxml (even though it does warn that automatic formatting can be dangerous) and/or b) something that can be avoided by using a DTD?

1 Answer

The loss of whitespace can be prevented by using a DTD that declares the element as mixed content:

<?php

$xml = '<!DOCTYPE p [
<!ELEMENT p (#PCDATA|span)*>
<!ELEMENT span (#PCDATA)>
]>
<p><span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>';

$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
$doc->formatOutput = true;

print $doc->saveXML($doc->documentElement);

// <p><span>one</span> <span>two</span> text <span>three</span> <span>four</span></p>
Answered 2013-09-10T10:33:34.093