python - Changing namespace prefixes with ElementTree in Python

Question

By default, when you call ElementTree.parse(someXMLfile), the Python ElementTree library prefixes every parsed node with its namespace URI in Clark notation:

    {http://example.org/namespace/spec}mynode

This makes accessing specific nodes by name later in the code very painful.
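To make the problem concrete, here is a minimal sketch of what the parsed tags look like. The document itself is made up for illustration; only the namespace URI is taken from the example above.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document that uses the namespace URI from the example above.
XML = '<root xmlns="http://example.org/namespace/spec"><mynode/></root>'

root = ET.parse(io.StringIO(XML)).getroot()

# Tags are stored in Clark notation, so the bare local name does not match.
print(root.tag)
print(root.find("mynode"))  # no match, because the stored tag includes the URI
print(root.find("{http://example.org/namespace/spec}mynode").tag)
```

Every lookup has to spell out the full `{uri}name` form, which is the inconvenience being described.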

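I've read the documentation on ElementTree and namespaces, and it looks like the iterparse() function should let me change how the parser prefixes namespaces, but for the life of me I can't actually get it to change the prefix. It seems this may happen in the background before the start-ns event even fires, as in this example: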

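from xml.etree.ElementTree import iterparse

# start-ns/end-ns events are only reported when explicitly requested
events = ("start", "end", "start-ns", "end-ns")
namespaces = []
for event, elem in iterparse(source, events=events):
    if event == "start-ns":
        namespaces.append(elem)  # elem is a (prefix, uri) tuple here
    elif event == "end-ns":
        namespaces.pop()
    else:
        ...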

How do I get it to change the prefix behavior, and what is the right thing to return when the function ends?
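For reference, a rough sketch of one way the start-ns events could be used: collect the prefixes the document itself declares, then rewrite each tag once its element has been fully parsed. The function name `parse_with_prefixes` and the sample input are hypothetical, and this is only an illustration under those assumptions, not a built-in ElementTree feature. The parser still produces Clark notation; the tags are rewritten afterwards.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input; the prefix "ex" and the element names are made up.
XML = b'<ex:root xmlns:ex="http://example.org/namespace/spec"><ex:mynode/></ex:root>'


def parse_with_prefixes(source):
    """Parse source and rewrite Clark-notation tags as prefix:local."""
    uri_to_prefix = {}
    parser = ET.iterparse(source, events=("end", "start-ns"))
    for event, payload in parser:
        if event == "start-ns":
            # payload is a (prefix, uri) tuple; a later declaration of the
            # same URI overwrites an earlier one in this simple sketch.
            prefix, uri = payload
            uri_to_prefix[uri] = prefix
        else:
            # "end": payload is a complete Element, so its tag can be changed.
            elem = payload
            if elem.tag.startswith("{"):
                uri, local = elem.tag[1:].split("}", 1)
                prefix = uri_to_prefix.get(uri)
                elem.tag = f"{prefix}:{local}" if prefix else local
    # iterparse exposes the root element once the source is fully read.
    return parser.root


root = parse_with_prefixes(io.BytesIO(XML))
print(root.tag)
print(root[0].tag)
```

A tree rewritten this way is convenient for comparing `elem.tag` against strings like `"ex:mynode"`, but the namespace information is no longer attached to the elements, so it is probably not suitable for serializing back to XML.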

score 6 · Accepted Answer

You don't need to use iterparse in particular. Instead, the following script:

from cStringIO import StringIO
import xml.etree.ElementTree as ET

NS_MAP = {
    'http://www.red-dove.com/ns/abc' : 'rdc',
    'http://www.adobe.com/2006/mxml' : 'mx',
    'http://www.red-dove.com/ns/def' : 'oth',
}

DATA = '''<?xml version="1.0" encoding="utf-8"?>
<rdc:container xmlns:mx="http://www.adobe.com/2006/mxml"
                 xmlns:rdc="http://www.red-dove.com/ns/abc"
                 xmlns:oth="http://www.red-dove.com/ns/def">
  <mx:Style>
    <oth:style1/>
  </mx:Style>
  <mx:Style>
    <oth:style2/>
  </mx:Style>
  <mx:Style>
    <oth:style3/>
  </mx:Style>
</rdc:container>'''

tree = ET.parse(StringIO(DATA))
some_node = tree.getroot().getchildren()[1]
print ET.fixtag(some_node.tag, NS_MAP)
some_node = some_node.getchildren()[0]
print ET.fixtag(some_node.tag, NS_MAP)

produces

('mx:Style', None)
('oth:style2', None)

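This shows how to get the fully qualified (prefixed) tag names of individual nodes in the parsed tree. You should be able to adapt it to your specific needs.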
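The script above targets Python 2 (cStringIO, print statements), and as far as I can tell current versions of xml.etree.ElementTree no longer provide fixtag or getchildren(). A minimal Python 3 sketch of the same idea follows; `prefixed_tag` is a hypothetical helper written for this example, not part of the library, and the sample data is a shortened version of the document above.

```python
import xml.etree.ElementTree as ET

NS_MAP = {
    "http://www.red-dove.com/ns/abc": "rdc",
    "http://www.adobe.com/2006/mxml": "mx",
    "http://www.red-dove.com/ns/def": "oth",
}

DATA = """<rdc:container xmlns:mx="http://www.adobe.com/2006/mxml"
               xmlns:rdc="http://www.red-dove.com/ns/abc"
               xmlns:oth="http://www.red-dove.com/ns/def">
  <mx:Style><oth:style1/></mx:Style>
  <mx:Style><oth:style2/></mx:Style>
</rdc:container>"""


def prefixed_tag(tag, ns_map):
    """Turn '{uri}local' into 'prefix:local' using a URI -> prefix map."""
    if not tag.startswith("{"):
        return tag  # tag has no namespace
    uri, local = tag[1:].split("}", 1)
    prefix = ns_map.get(uri)
    # Unknown URIs fall back to the bare local name in this sketch.
    return f"{prefix}:{local}" if prefix else local


root = ET.fromstring(DATA)
some_node = list(root)[1]  # second mx:Style element
print(prefixed_tag(some_node.tag, NS_MAP))     # mx:Style
print(prefixed_tag(some_node[0].tag, NS_MAP))  # oth:style2
```

Unlike the old fixtag, this helper returns only the prefixed name rather than a (tag, xmlns) tuple.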

score 2 · Accepted Answer

xml.etree.ElementTree doesn't seem to have fixtag; well, not according to the documentation. However, I've looked at some source code for fixtag, and you can do:

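import xml.etree.ElementTree as ET

for event, elem in ET.iterparse(inFile, events=("start", "end")):
    # Clark notation is "{uri}local": drop the leading "{" and split at "}".
    # Assumes every tag is namespaced; an unqualified tag raises ValueError.
    namespace, looktag = elem.tag[1:].split("}", 1)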

You then have the tag string in looktag, suitable for lookups. The namespace URI is in namespace.
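If the document may also contain elements without a namespace, the split can be wrapped in a small helper. This is only a sketch; `split_tag` is a hypothetical name, and the sample document is made up.

```python
import io
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split a Clark-notation tag into (namespace, local name).

    Tags without a namespace yield (None, tag).
    """
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return None, tag


# Hypothetical input mixing a namespaced and a plain element.
XML = b'<root><n:item xmlns:n="http://example.org/namespace/spec"/></root>'

for event, elem in ET.iterparse(io.BytesIO(XML), events=("end",)):
    namespace, looktag = split_tag(elem.tag)
    print(namespace, looktag)
```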
