My question is about an XML file I'm working with, which looks like this.

<log>
<logentry
 revision="33185">
<author>glv</author>
<date>2012-08-06T21:01:52.494219Z</date>
<paths>

<path
 kind="file"
 action="M">/branches/Patch_4_2_0_Branch/text.xml</path>   

<path
 kind="dir"
 action="M">/branches/Patch_4_2_0_Branch</path>

</paths>
<msg>PATCH_BRANCH:N/A
BUG_NUMBER:N/A
FEATURE_AFFECTED:N/A
OVERVIEW:N/A
Adding the SVN log size requirement to the branch 
</msg>
  </logentry>
    </log>

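What I want to do is use an "if" statement that looks at each XML path tag and checks whether it is kind="dir" or kind="file", then adds the path to a variable named content. This is what I have so far. I'm using dom.import, by the way.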

xmlPath = dom.getElementsByTagName('paths')[0]
xmlPathM =  xmlPath.getAttribute('kind')    
if xmlPathM == dir:
    content += "Directory location:" + xmlPathM +"\n \n"
else:
    content += "FileName"  + xmlPathM +"\n \n "

Right now it doesn't seem to work: I believe it prints FileName where it should print Directory location. What I want is for it to look at this log and print

Directory location: /branches/Patch_4_2_0_Branch 

FileName:/branches/Patch_4_2_0_Branch/text.xml

for the same log entry. Any ideas about what I'm missing or doing wrong?

1 Answer

Working in a good interactive console (such as IPython) is a wonderful thing. [Side note: I prefer ElementTree because I like its interface better, but anyway.]

Let's dig in. First, read the file:

In [51]: import xml.dom.minidom as minidom

In [52]: dom = minidom.parse("log.xml")

In [53]: dom
Out[53]: <xml.dom.minidom.Document instance at 0x97082ec>

Now look inside paths:

In [55]: dom.getElementsByTagName("paths")
Out[55]: [<DOM Element: paths at 0x97086cc>]

In [56]: dom.getElementsByTagName("paths")[0]
Out[56]: <DOM Element: paths at 0x97086cc>

In [57]: vars(dom.getElementsByTagName("paths")[0])
Out[57]: 
{'_attrs': {},
 '_attrsNS': {},
 'childNodes': [<DOM Text node "u'\n\n'">,
  <DOM Element: path at 0x970874c>,
  <DOM Text node "u'   \n\n'">,
  <DOM Element: path at 0x97088ac>,
  <DOM Text node "u'\n\n'">],
 'namespaceURI': None,
 'nextSibling': <DOM Text node "u'\n'">,
 'nodeName': u'paths',
 'ownerDocument': <xml.dom.minidom.Document instance at 0x97082ec>,
 'parentNode': <DOM Element: logentry at 0x970848c>,
 'prefix': None,
 'previousSibling': <DOM Text node "u'\n'">,
 'tagName': u'paths'}

Look at childNodes:

In [58]: dom.getElementsByTagName("paths")[0].childNodes
Out[58]: 
[<DOM Text node "u'\n\n'">,
 <DOM Element: path at 0x970874c>,
 <DOM Text node "u'   \n\n'">,
 <DOM Element: path at 0x97088ac>,
 <DOM Text node "u'\n\n'">]

Whitespace is significant, so this is a bit of a headache. But you can throw away the non-elements:

In [61]: elements = [x for x in dom.getElementsByTagName("paths")[0].childNodes if isinstance(x, minidom.Element)]

In [62]: elements
Out[62]: [<DOM Element: path at 0x970874c>, <DOM Element: path at 0x97088ac>]
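The same filtering can be done without isinstance. What follows is only a minimal Python 3 sketch, not part of the original session: SAMPLE is a hypothetical trimmed-down copy of the log from the question, and the variable names are made up for illustration. It shows two options: checking each child's nodeType, or asking minidom for the path elements directly with getElementsByTagName.

```python
import xml.dom.minidom as minidom

# Hypothetical trimmed-down sample, not the full log from the question.
SAMPLE = """<log><logentry revision="33185"><paths>

<path kind="file" action="M">/branches/Patch_4_2_0_Branch/text.xml</path>

<path kind="dir" action="M">/branches/Patch_4_2_0_Branch</path>

</paths></logentry></log>"""

dom = minidom.parseString(SAMPLE)
paths = dom.getElementsByTagName("paths")[0]

# Option 1: keep only element children by checking the DOM node type.
by_node_type = [n for n in paths.childNodes if n.nodeType == n.ELEMENT_NODE]

# Option 2: ask for the <path> elements directly. Note that this searches
# every descendant of the document, not just the first <paths> element.
by_tag_name = dom.getElementsByTagName("path")

print(len(paths.childNodes), len(by_node_type), len(by_tag_name))
```

Option 2 is shorter, but on a log with several logentry elements it returns the path elements of all of them, so calling getElementsByTagName on a single logentry element may be the better fit there.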

Look inside:

In [65]: elements
Out[65]: [<DOM Element: path at 0x970874c>, <DOM Element: path at 0x97088ac>]

In [66]: vars(elements[0])
Out[66]: 
{'_attrs': {u'action': <xml.dom.minidom.Attr instance at 0x970880c>,
  u'kind': <xml.dom.minidom.Attr instance at 0x97087ac>},
 '_attrsNS': {(None, u'action'): <xml.dom.minidom.Attr instance at 0x970880c>,
  (None, u'kind'): <xml.dom.minidom.Attr instance at 0x97087ac>},
 'childNodes': [<DOM Text node "u'/branches/'...">],
 'namespaceURI': None,
 'nextSibling': <DOM Text node "u'   \n\n'">,
 'nodeName': u'path',
 'ownerDocument': <xml.dom.minidom.Document instance at 0x97082ec>,
 'parentNode': <DOM Element: paths at 0x97086cc>,
 'prefix': None,
 'previousSibling': <DOM Text node "u'\n\n'">,
 'tagName': u'path'}

Finally, we know how to get what we want:

In [67]: for elem in elements:
   ....:     print elem, elem.childNodes[0].nodeValue, elem.getAttribute("kind"), elem.getAttribute("action")
   ....:     
<DOM Element: path at 0x970874c> /branches/Patch_4_2_0_Branch/text.xml file M
<DOM Element: path at 0x97088ac> /branches/Patch_4_2_0_Branch dir M
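Putting this together for the question: the kind attribute lives on each path element, while the paths element has no attributes at all (its _attrs is empty in the dump above), and the comparison has to be against the string "dir" rather than the builtin dir. The following is a hedged Python 3 sketch of that idea, not the asker's actual program. describe_paths and SAMPLE are hypothetical names, and it assumes every path element contains non-empty text.

```python
import xml.dom.minidom as minidom

# Hypothetical trimmed-down sample, not the full log from the question.
SAMPLE = """<log><logentry revision="33185"><paths>

<path kind="file" action="M">/branches/Patch_4_2_0_Branch/text.xml</path>

<path kind="dir" action="M">/branches/Patch_4_2_0_Branch</path>

</paths></logentry></log>"""


def describe_paths(dom):
    """Build the report text from every <path> element in the document."""
    content = ""
    for elem in dom.getElementsByTagName("path"):
        # The path itself is the element's text node (assumed non-empty).
        location = elem.firstChild.nodeValue.strip()
        # Compare against the string "dir", not the builtin function dir.
        if elem.getAttribute("kind") == "dir":
            content += "Directory location: " + location + "\n\n"
        else:
            content += "FileName: " + location + "\n\n"
    return content


print(describe_paths(minidom.parseString(SAMPLE)))
```

The entries come out in document order, so the file line appears before the directory line for this log. Sorting or grouping by kind would be a separate step.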

I can't imagine doing this without working interactively.
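For comparison, here is roughly what the same lookup might look like with ElementTree, the interface mentioned in the side note at the top of this answer. It is a sketch only, written for Python 3, and SAMPLE and rows are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed-down sample, not the full log from the question.
SAMPLE = """<log><logentry revision="33185"><paths>

<path kind="file" action="M">/branches/Patch_4_2_0_Branch/text.xml</path>

<path kind="dir" action="M">/branches/Patch_4_2_0_Branch</path>

</paths></logentry></log>"""

root = ET.fromstring(SAMPLE)

# iter("path") walks every descendant named "path". Whitespace between
# elements is stored as text/tail strings, not as separate nodes, so there
# is nothing to filter out.
rows = [(p.get("kind"), p.get("action"), p.text) for p in root.iter("path")]

for kind, action, text in rows:
    print(kind, action, text)
```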

answered 2012-08-27T20:57:39.010