python - Append or prepend a PI before/after the root element with lxml

Question

With lxml, how can I add a processing instruction (PI) before the root element, or after the root element?

Currently, the following example does not work:

from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
print(etree.tounicode(root))

I get:

<ROOT/>

instead of:

<?foo?><ROOT/>

score 2 · Accepted Answer

You need to pass an ElementTree to tounicode(), not just an Element:

from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
print(etree.tounicode(root.getroottree()))

The output is almost what you want:

<?foo ?><ROOT/>

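An extra space character appears after foo because lxml renders a PI as pi.target + " " + pi.text, so the separator is written even when the PI has no text.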
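To see the same rendering rule with a PI that does carry text, here is a minimal sketch. The xml-stylesheet target and the style.xsl file name are made up for illustration and do not come from the question.

```python
from lxml import etree

root = etree.XML("<ROOT/>")

# Hypothetical PI: the target and the pseudo-attributes are illustrative only.
pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
)
root.addprevious(pi)

# Serialize the whole tree (not just the root element) so the PI is included.
doc = etree.tostring(root.getroottree(), encoding="unicode")

print(pi.target)  # the PI target
print(pi.text)    # the PI text
print(doc)        # the PI is written as target, a space, then text
```

With non-empty text the space reads naturally as the separator between the target and its content.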

score 1 · Accepted Answer

In fact, an Element is always attached to an ElementTree, even when it looks "detached":

root = etree.XML("<ROOT/>")
assert root.getroottree() is not None

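When we use addprevious/addnext to insert a processing instruction before/after the root element, the PI is not attached to a parent element (there is none) but to the root tree.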
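One way to check this is to inspect the root's parent and siblings after the insertion. This is a small sketch that assumes getprevious()/getnext() expose document-level siblings of the root element, which is how lxml behaves as far as I know:

```python
from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

# The root element still has no parent element...
parent = root.getparent()

# ...but the PIs are reachable as its siblings at document level.
before = root.getprevious()
after = root.getnext()

print(parent, before.target, after.target)
```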

So the problem lies in how tounicode (or tostring) is called. The best practice is to print the XML of the root tree rather than of the root element.

from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

print(etree.tounicode(root))
# => "<ROOT/>"

print(etree.tounicode(root.getroottree()))
# => "<?foo ?><ROOT/><?bar ?>"
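The same comparison can be written with tostring(..., encoding="unicode"), which, as far as I know, the lxml documentation recommends over tounicode. The helper name serialize_document is hypothetical and only wraps the getroottree() call:

```python
from lxml import etree


def serialize_document(element):
    """Hypothetical helper: serialize the whole document `element` belongs to."""
    return etree.tostring(element.getroottree(), encoding="unicode")


root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

# Serializing the element alone leaves out its document-level siblings.
element_only = etree.tostring(root, encoding="unicode")

# Serializing the tree includes the PIs before and after the root.
whole_document = serialize_document(root)

print(element_only)
print(whole_document)
```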
