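I am trying to get a compact representation of namespaces in ElementTree or lxml when the child elements are in a different namespace from the parent. Here is a basic example: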

from lxml import etree

country = etree.Element("country")

name = etree.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = etree.SubElement(country, "{urn:test}population")
population.text = "34M"
etree.register_namespace('tst', 'urn:test')

print(etree.tostring(country, pretty_print=True).decode())

I have also tried this approach:

ns = {"test" : "urn:test"}

country = etree.Element("country", nsmap=ns)

name = etree.SubElement(country, "{test}name")
name.text = "Canada"
population = etree.SubElement(country, "{test}population")
population.text = "34M"

print(etree.tostring(country, pretty_print=True).decode())

In both cases I get output like this:

<country>
    <ns0:name xmlns:ns0="urn:test">Canada</ns0:name>
    <ns1:population xmlns:ns1="urn:test">34M</ns1:population>
</country>

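While this is correct, I would like it to be less verbose. This can become a real problem with large data sets, especially since the namespace I actually use is much longer than "urn:test".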

If I accept that "country" is inside the "urn:test" namespace and declare it like this (in the first example above):

country = etree.Element("{test}country")

then I get the following output:

<ns0:country xmlns:ns0="urn:test">
    <ns0:name>Canada</ns0:name>
    <ns0:population>34M</ns0:population>
</ns0:country>

But what I really want is this:

<country xmlns:ns0="urn:test">
    <ns0:name>Canada</ns0:name>
    <ns0:population>34M</ns0:population>
</country>

Any ideas?

3 Answers

  1. The full name of an element has the form {namespace-url}elementName, not {prefix}elementName:

    >>> from lxml import etree as ET
    >>> r = ET.Element('root', nsmap={'tst': 'urn:test'})
    >>> ET.SubElement(r, "{urn:test}child")
    <Element {urn:test}child at 0x2592a80>
    >>> ET.tostring(r)
    '<root xmlns:tst="urn:test"><tst:child/></root>'
    
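  2. In your case the output could be even more compact if you set the default namespace. Unfortunately, lxml does not seem to allow an empty XML namespace, but you said you are fine with putting the parent tag in the same namespace as the children, so you can set the default namespace to the children's namespace: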

    >>> r = ET.Element('{urn:test}root', nsmap={None: 'urn:test'})
    >>> ET.SubElement(r, "{urn:test}child")
    <Element {urn:test}child at 0x2592b20>
    >>> ET.SubElement(r, "{urn:test}child")
    <Element {urn:test}child at 0x25928f0>
    >>> ET.tostring(r)
    '<root xmlns="urn:test"><child/><child/></root>'
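To illustrate the first point with the question's second attempt, here is a minimal sketch (the variable names are mine, not from the question). It assumes lxml is installed and uses `etree.QName` only to inspect which namespace each element ended up in:

```python
from lxml import etree

NS = "urn:test"  # namespace URI from the question
country = etree.Element("country", nsmap={"test": NS})

# Mistake: braces around the prefix. lxml reads "test" as a namespace URI.
wrong = etree.SubElement(country, "{test}name")
# Clark notation: braces around the namespace URI itself.
right = etree.SubElement(country, "{%s}population" % NS)

wrong_ns = etree.QName(wrong).namespace
right_ns = etree.QName(right).namespace
print(wrong_ns, right_ns)
print(right.prefix)  # prefix picked up from the root's nsmap
```

Only the element created with the URI in braces reuses the prefix declared on the root; the other one lands in a separate namespace whose URI is literally "test".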
    
answered 2013-04-22T13:21:41.577

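from xml.etree import ElementTree as ET  # cElementTree was removed in Python 3.9

# Uncomment to serialize "urn:test" with the "tst" prefix instead of "ns0".
# ET.register_namespace('tst', 'urn:test')
country = ET.Element("country")
name = ET.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = ET.SubElement(country, "{urn:test}population")
population.text = "34M"
print(prettify(country))  # prettify is an external helper; see the note below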

The above gives (without registering any namespace):

<?xml version="1.0" ?>
<country xmlns:ns0="urn:test">
  <ns0:name>Canada</ns0:name>
  <ns0:population>34M</ns0:population>
</country>

And when I uncomment the register_namespace line, it gives:

<?xml version="1.0" ?>
<country xmlns:tst="urn:test">
  <tst:name>Canada</tst:name>
  <tst:population>34M</tst:population>
</country>

Note: the prettify function is here
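The prettify helper itself is not shown in this answer. A minimal sketch, assuming it follows the common minidom-based recipe (this body is my guess, not necessarily the linked function):

```python
from xml.dom import minidom
from xml.etree import ElementTree as ET


def prettify(elem):
    """Return an indented XML string for an ElementTree element (assumed helper)."""
    rough = ET.tostring(elem, encoding="utf-8")
    return minidom.parseString(rough).toprettyxml(indent="  ")


country = ET.Element("country")
ET.SubElement(country, "{urn:test}name").text = "Canada"
ET.SubElement(country, "{urn:test}population").text = "34M"
pretty = prettify(country)
print(pretty)
```

The standard library serializer collects the namespace declarations on the root element, which is why the namespace URI appears only once in the output.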

answered 2013-04-21T13:37:01.130

This code:

from lxml import etree

ns = {"ns0" : "urn:test"}
country = etree.Element("country", nsmap=ns)

name = etree.SubElement(country, "{urn:test}name")
name.text = "Canada"

population = etree.SubElement(country, "{urn:test}population")
population.text = "34M"

print(etree.tostring(country, pretty_print=True).decode())

seems to give the desired output:

<country xmlns:ns0="urn:test">
  <ns0:name>Canada</ns0:name>
  <ns0:population>34M</ns0:population>
</country>

But you still have to maintain the nsmap yourself.
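If the tree has already been built without an nsmap, one possible alternative is to move the declaration to the root afterwards with `etree.cleanup_namespaces`. This is only a sketch: it assumes lxml 3.5 or later (where the `top_nsmap` argument exists), and the "tst" prefix is my own choice:

```python
from lxml import etree

country = etree.Element("country")  # no nsmap declared up front
etree.SubElement(country, "{urn:test}name").text = "Canada"
etree.SubElement(country, "{urn:test}population").text = "34M"

# Declare the prefix on the root and drop the redundant per-child declarations.
# Assumes lxml 3.5 or later, where top_nsmap was added.
etree.cleanup_namespaces(country, top_nsmap={"tst": "urn:test"})

xml = etree.tostring(country, pretty_print=True).decode()
print(xml)
```

You still have to supply the prefix mapping once, but it no longer has to be threaded through every place that creates elements.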

answered 2013-04-18T23:37:02.120