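Using a good interactive console (such as IPython) is a great thing. [Aside: I prefer ElementTree because I like its interface better, but anyway.]
Let's dig in. First, read the file: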
In [51]: import xml.dom.minidom as minidom
In [52]: dom = minidom.parse("log.xml")
In [53]: dom
Out[53]: <xml.dom.minidom.Document instance at 0x97082ec>
Now look inside the paths element:
In [55]: dom.getElementsByTagName("paths")
Out[55]: [<DOM Element: paths at 0x97086cc>]
In [56]: dom.getElementsByTagName("paths")[0]
Out[56]: <DOM Element: paths at 0x97086cc>
In [57]: vars(dom.getElementsByTagName("paths")[0])
Out[57]:
{'_attrs': {},
'_attrsNS': {},
'childNodes': [<DOM Text node "u'\n\n'">,
<DOM Element: path at 0x970874c>,
<DOM Text node "u' \n\n'">,
<DOM Element: path at 0x97088ac>,
<DOM Text node "u'\n\n'">],
'namespaceURI': None,
'nextSibling': <DOM Text node "u'\n'">,
'nodeName': u'paths',
'ownerDocument': <xml.dom.minidom.Document instance at 0x97082ec>,
'parentNode': <DOM Element: logentry at 0x970848c>,
'prefix': None,
'previousSibling': <DOM Text node "u'\n'">,
'tagName': u'paths'}
Look at childNodes:
In [58]: dom.getElementsByTagName("paths")[0].childNodes
Out[58]:
[<DOM Text node "u'\n\n'">,
<DOM Element: path at 0x970874c>,
<DOM Text node "u' \n\n'">,
<DOM Element: path at 0x97088ac>,
<DOM Text node "u'\n\n'">]
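Whitespace is significant in XML, so the text nodes between the elements are a bit of a headache. But we can throw away the non-elements: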
In [61]: elements = [x for x in dom.getElementsByTagName("paths")[0].childNodes if isinstance(x, minidom.Element)]
In [62]: elements
Out[62]: [<DOM Element: path at 0x970874c>, <DOM Element: path at 0x97088ac>]
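The same filtering can be wrapped in a small helper. This is only a sketch in Python 3 (the session above is Python 2). The name `child_elements` and the sample XML are made up for illustration, and the `nodeType` check is an alternative to the `isinstance` test used above:

```python
import xml.dom.minidom as minidom
from xml.dom import Node

# Hypothetical sample, shaped like the <paths> element in the session above.
SAMPLE = """<paths>
<path kind="file" action="M">/branches/a.txt</path>
<path kind="dir" action="M">/branches</path>
</paths>"""


def child_elements(node):
    """Return only the element children of node, skipping text nodes."""
    return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]


dom = minidom.parseString(SAMPLE)
paths = dom.getElementsByTagName("paths")[0]
elements = child_elements(paths)
print([e.tagName for e in elements])
```

Comparing `nodeType` avoids depending on the concrete `minidom.Element` class, so the helper should also work with other DOM implementations that follow the same interface.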
Look inside one of them:
In [65]: elements
Out[65]: [<DOM Element: path at 0x970874c>, <DOM Element: path at 0x97088ac>]
In [66]: vars(elements[0])
Out[66]:
{'_attrs': {u'action': <xml.dom.minidom.Attr instance at 0x970880c>,
u'kind': <xml.dom.minidom.Attr instance at 0x97087ac>},
'_attrsNS': {(None, u'action'): <xml.dom.minidom.Attr instance at 0x970880c>,
(None, u'kind'): <xml.dom.minidom.Attr instance at 0x97087ac>},
'childNodes': [<DOM Text node "u'/branches/'...">],
'namespaceURI': None,
'nextSibling': <DOM Text node "u' \n\n'">,
'nodeName': u'path',
'ownerDocument': <xml.dom.minidom.Document instance at 0x97082ec>,
'parentNode': <DOM Element: paths at 0x97086cc>,
'prefix': None,
'previousSibling': <DOM Text node "u'\n\n'">,
'tagName': u'path'}
Finally, we know what we want:
In [67]: for elem in elements:
   ....:     print elem, elem.childNodes[0].nodeValue, elem.getAttribute("kind"), elem.getAttribute("action")
   ....:
<DOM Element: path at 0x970874c> /branches/Patch_4_2_0_Branch/text.xml file M
<DOM Element: path at 0x97088ac> /branches/Patch_4_2_0_Branch dir M
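Once the exploration is done, the result can be collected into a small script. The following is a sketch in Python 3, not the original session. The contents of log.xml are not shown above, so `SAMPLE_LOG` is an assumed reconstruction: only the `logentry`, `paths` and `path` elements and the two printed rows come from the session, while the `log` root, the `revision` value and the function name `changed_paths` are guesses for illustration.

```python
import xml.dom.minidom as minidom

# Assumed input: only <logentry>/<paths>/<path> and the two rows are confirmed
# by the session; the <log> root and revision="1" are hypothetical.
SAMPLE_LOG = """<?xml version="1.0"?>
<log>
<logentry revision="1">
<paths>
<path kind="file" action="M">/branches/Patch_4_2_0_Branch/text.xml</path>
<path kind="dir" action="M">/branches/Patch_4_2_0_Branch</path>
</paths>
</logentry>
</log>"""


def changed_paths(xml_text):
    """Return (path, kind, action) tuples for every <path> element."""
    dom = minidom.parseString(xml_text)
    rows = []
    # getElementsByTagName matches the exact tag name, so "paths" is skipped.
    for elem in dom.getElementsByTagName("path"):
        text = elem.firstChild.nodeValue.strip()
        rows.append((text, elem.getAttribute("kind"), elem.getAttribute("action")))
    return rows


rows = changed_paths(SAMPLE_LOG)
for path, kind, action in rows:
    print(path, kind, action)
```

Searching for `path` directly skips the whitespace text nodes entirely, which is simpler than filtering `childNodes` when the nesting level does not matter.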
I can't imagine doing this without an interactive session.
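For comparison with the ElementTree aside at the top, here is a rough equivalent using `xml.etree.ElementTree` in Python 3. As before, the sample document is an assumed reconstruction (the `log` root and `revision` value are hypothetical), so treat this as a sketch rather than a drop-in replacement:

```python
import xml.etree.ElementTree as ET

# Assumed input, reconstructed from the output printed in the session above.
SAMPLE_LOG = """<?xml version="1.0"?>
<log>
<logentry revision="1">
<paths>
<path kind="file" action="M">/branches/Patch_4_2_0_Branch/text.xml</path>
<path kind="dir" action="M">/branches/Patch_4_2_0_Branch</path>
</paths>
</logentry>
</log>"""

root = ET.fromstring(SAMPLE_LOG)

# iter("path") walks the whole tree; whitespace between elements ends up in
# .text/.tail attributes rather than as separate nodes, so nothing to filter.
et_rows = [(p.text, p.get("kind"), p.get("action")) for p in root.iter("path")]

for row in et_rows:
    print(*row)
```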