85

I am generating an XML document in Python using ElementTree, but the tostring function does not include an XML declaration when converting to plain text.

from xml.etree.ElementTree import Element, SubElement, tostring

document = Element('outer')
node = SubElement(document, 'inner')
print(tostring(document))  # Outputs <outer><inner /></outer> with no XML declaration

I need my string to include the following XML declaration:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

However, there does not seem to be any documented way of doing this.

Is there a proper method for rendering the XML declaration in an ElementTree?


11 Answers

131

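I am surprised to find that there doesn't seem to be a way to do this with ElementTree.tostring(). You can, however, use ElementTree.ElementTree.write() to write your XML document to a fake file: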

from io import BytesIO
from xml.etree import ElementTree as ET

document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
et = ET.ElementTree(document)

f = BytesIO()
et.write(f, encoding='utf-8', xml_declaration=True) 
print(f.getvalue())  # your XML file, encoded as UTF-8

See this question. Even then, I don't think you can get the "standalone" attribute without writing the prefix yourself.
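
A minimal sketch of writing the prefix by hand, assuming Python 3 and UTF-8 output. The helper name tostring_with_standalone and its standalone parameter are hypothetical; they are not part of ElementTree.

```python
from io import BytesIO
from xml.etree import ElementTree as ET


def tostring_with_standalone(root, standalone="yes"):
    """Serialize root as UTF-8 bytes behind a hand-written XML declaration.

    Hypothetical helper: ElementTree's write() has no 'standalone' option.
    """
    buffer = BytesIO()
    # Suppress the built-in declaration so that only ours appears.
    ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=False)
    declaration = (
        '<?xml version="1.0" encoding="UTF-8" standalone="%s"?>\n' % standalone
    )
    return declaration.encode("utf-8") + buffer.getvalue()


document = ET.Element("outer")
ET.SubElement(document, "inner")
print(tostring_with_standalone(document).decode("utf-8"))
```

The declaration is plain ASCII, so encoding it separately and concatenating it with the serialized bytes keeps the result a single, consistently encoded byte string.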

answered 2013-03-12T09:36:37.797
32

I would use lxml (see http://lxml.de/api.html).

Then you can:

from lxml import etree
document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, xml_declaration=True))
answered 2013-03-12T08:50:45.607
24

If you include encoding='utf8', you will get an XML header:

xml.etree.ElementTree.tostring writes an XML encoding declaration with encoding='utf8'

Sample Python code (works with Python 2 and 3):

import xml.etree.ElementTree as ElementTree

tree = ElementTree.ElementTree(
    ElementTree.fromstring('<xml><test>123</test></xml>')
)
root = tree.getroot()

print('without:')
print(ElementTree.tostring(root, method='xml'))
print('')
print('with:')
print(ElementTree.tostring(root, encoding='utf8', method='xml'))

Python 2 output:

$ python2 example.py
without:
<xml><test>123</test></xml>

with:
<?xml version='1.0' encoding='utf8'?>
<xml><test>123</test></xml>

With Python 3, you will note the b prefix indicating that a bytes literal is returned (Python 2 returns a byte string as well):

$ python3 example.py
without:
b'<xml><test>123</test></xml>'

with:
b"<?xml version='1.0' encoding='utf8'?>\n<xml><test>123</test></xml>"
answered 2017-02-27T21:25:42.440
8

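The xml_declaration argument

Is there a proper method for rendering the XML declaration in an ElementTree?

Yes, and there is no need to use the .tostring function. According to the ElementTree documentation, you should create an ElementTree object, create the Element and its SubElements, set the tree's root, and finally pass the xml_declaration argument to the .write function, so that the declaration line is included in the output file.

You can do it this way: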

import xml.etree.ElementTree as ET

# Build the element hierarchy first
document = ET.Element("outer")
node1 = ET.SubElement(document, "inner")
node1.text = "text"

# Wrap the root element in a tree; this replaces the private _setroot() call
tree = ET.ElementTree(document)
tree.write("./output.xml", encoding="UTF-8", xml_declaration=True)

The output file is:

<?xml version='1.0' encoding='UTF-8'?>
<outer><inner>text</inner></outer>
answered 2020-10-16T19:49:47.870
3

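I ran into this issue recently, and after some digging through the code I found that the following snippet is the definition of the function ElementTree.write: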

def write(self, file, encoding="us-ascii"):
    assert self._root is not None
    if not hasattr(file, "write"):
        file = open(file, "wb")
    if not encoding:
        encoding = "us-ascii"
    elif encoding != "utf-8" and encoding != "us-ascii":
        file.write("<?xml version='1.0' encoding='%s'?>\n" % encoding)
    self._write(file, self._root, encoding, {})

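So the answer is: in the version quoted above, if you need the XML header written to your file, set the encoding argument to something other than utf-8 or us-ascii, e.g. UTF-8 (the comparison is an exact string match, so the upper-case spelling counts as a different value).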
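
The snippet quoted above appears to come from an older ElementTree release. In Python 3 the encoding name is lower-cased before that comparison, so 'UTF-8' no longer triggers the declaration on its own, while 'utf8' (no hyphen) still does. A small sketch, assuming Python 3, that checks this behaviour and shows that requesting the declaration explicitly through write() does not depend on the spelling:

```python
from io import BytesIO
import xml.etree.ElementTree as ET

root = ET.Element("outer")
ET.SubElement(root, "inner")

# Python 3: the name is lower-cased before the check, so no declaration here.
print(ET.tostring(root, encoding="UTF-8"))

# 'utf8' (no hyphen) is not in the exclusion list, so a declaration is written.
print(ET.tostring(root, encoding="utf8"))

# Asking for the declaration explicitly works for any encoding name.
buffer = BytesIO()
ET.ElementTree(root).write(buffer, encoding="UTF-8", xml_declaration=True)
print(buffer.getvalue())
```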

answered 2015-01-29T06:22:05.867
2

A minimal working example using the ElementTree package:

import xml.etree.ElementTree as ET

document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
node.text = '1'
res = ET.tostring(document, encoding='utf8', method='xml').decode()
print(res)

The output is:

<?xml version='1.0' encoding='utf8'?>
<outer><inner>1</inner></outer>
answered 2018-09-23T05:35:40.123
2

Simple

Sample for both Python 2 and 3 (the encoding parameter must be utf8):

import xml.etree.ElementTree as ElementTree

tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='utf8', method='xml'))

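Since Python 3.8 there is an xml_declaration parameter for this:

New in version 3.8: The xml_declaration and default_namespace parameters.

xml.etree.ElementTree.tostring(element, encoding="us-ascii", method="xml", *, xml_declaration=None, default_namespace=None, short_empty_elements=True)
Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml"). xml_declaration, default_namespace and short_empty_elements have the same meaning as in ElementTree.write(). Returns an (optionally) encoded string containing the XML data.

Sample for Python 3.8 and higher: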

import xml.etree.ElementTree as ElementTree

tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='unicode', method='xml', xml_declaration=True))
answered 2020-10-16T08:50:08.067
1

Another pretty simple option is to concatenate the desired header to the XML string, like this:

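import xml.etree.ElementTree as ET

root = ET.Element('invoice')  # sample root element; substitute your own tree
# Prepend the header as bytes and write in binary mode so the file really is UTF-8
xml = b'<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(root, encoding='utf-8')
with open('invoice.xml', 'wb') as f:
    f.write(xml)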
answered 2019-02-06T13:54:31.443
0

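I would use ET, falling back through whichever implementation is available. Note that tostring only accepts xml_declaration in lxml and, for the standard library, from Python 3.8 on: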

try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")

document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, encoding='UTF-8', xml_declaration=True))
answered 2015-04-23T11:01:04.267
0

This works if you just want to print. I get an error when I try to send it to a file...

import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import Element, SubElement, Comment, tostring

def prettify(elem):
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")
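
One possible cause of the file error, assuming Python 3, is mixing text and bytes when writing. A hedged sketch of one way around it: when toprettyxml is given an encoding it returns bytes with a matching declaration, which can then be written to a file opened in binary mode. The helper name prettify_to_file and the temporary output path are illustrative, not part of the original answer.

```python
import os
import tempfile
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET


def prettify_to_file(elem, path):
    """Pretty-print elem and write it to path as UTF-8 bytes (hypothetical helper)."""
    rough_string = ET.tostring(elem, "utf-8")
    reparsed = minidom.parseString(rough_string)
    # With an encoding, toprettyxml returns bytes and names it in the declaration.
    pretty_bytes = reparsed.toprettyxml(indent="  ", encoding="utf-8")
    with open(path, "wb") as f:
        f.write(pretty_bytes)
    return pretty_bytes


document = ET.Element("outer")
ET.SubElement(document, "inner").text = "1"
out_path = os.path.join(tempfile.mkdtemp(), "pretty.xml")
written = prettify_to_file(document, out_path)
print(written.decode("utf-8"))
```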
answered 2016-03-09T15:36:11.330
0

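Including "standalone" in the declaration

I didn't find any alternative in the documentation for adding the standalone argument, so I adapted the ET.tostring function to take the declaration as an argument.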

from xml.etree import ElementTree as ET

# Sample
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')

# Function that you need (Python 3)
def tostring(element, declaration, encoding=None, method=None):
    # Serialize as text without ElementTree's own declaration, then prepend ours
    body = ET.tostring(element, encoding='unicode', method=method)
    text = declaration + "\n" + body
    # Return bytes when an encoding is requested, text otherwise
    return text.encode(encoding, 'xmlcharrefreplace') if encoding else text

# Working example
xdec = """<?xml version="1.0" encoding="UTF-8" standalone="no" ?>"""
xml = tostring(document, encoding='utf-8', declaration=xdec)
answered 2018-11-09T15:04:28.837