I am writing my first Python script, using libxml2 to retrieve data from an XML file. The file looks like this:

<myGroups1>
<myGrpContents name="ABC" help="abc_help">
     <myGrpKeyword name="abc1" help="help1"/>
     <myGrpKeyword name="abc2" help="help2"/>
     <myGrpKeyword name="abc3" help="help3"/>
</myGrpContents>
</myGroups1>

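The file contains many similar groups. My goal is to read the "name" and "help" attributes and write them to another file in a different format. However, with the code below I can only get as far as the myGroups1 element.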

doc = libxml2.parseFile(cmmfilename)
root2 = doc.children
child = root2.children
while child is not None:
    if not child.isBlankNode():
        if child.type == "element":
            print "\t Element ", child.name, " with ", child.lsCountNode(), "child(ren)"
            print "\t and content ", repr(child.content)
    child = child.next

How can I iterate deeper into the elements and get the attributes? Any help would be greatly appreciated.

2 Answers

"Python. How to get attribute value with libxml2" is probably the answer you are looking for.

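When I run into a problem like this and, for whatever reason, don't feel like reading the documentation, exploring the library interactively helps a lot. I suggest trying this in an interactive Python REPL (I like bpython). Here is the session in which I worked out the solution: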

>>> import libxml2
>>> xml = """<myGroups1>
... <myGrpContents name="ABC" help="abc_help">
...      <myGrpKeyword name="abc1" help="help1"/>
...      <myGrpKeyword name="abc2" help="help2"/>
...      <myGrpKeyword name="abc3" help="help3"/>
... </myGrpContents>
... </myGroups1>"""
>>> tree = libxml2.parseMemory(xml, len(xml)) # I found this method by looking through `dir(libxml2)`
>>> tree.children
<xmlNode (myGroups1) object at 0x10aba33b0>
>>> a = tree.children
>>> a
<xmlNode (myGroups1) object at 0x10a919ea8>
>>> a.children
<xmlNode (text) object at 0x10ab24368>
>>> a.properties
>>> b = a.children
>>> b.children
>>> b.properties
>>> b.next
<xmlNode (myGrpContents) object at 0x10a921290>
>>> b.next.content
'\n     \n     \n     \n'
>>> b.next.next.content
'\n'
>>> b.next.next.next.content
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'content'
>>> b.next.next.next
>>> b.next.properties
<xmlAttr (name) object at 0x10aba32d8>
>>> b.next.properties.children
<xmlNode (text) object at 0x10ab40f38>
>>> b.next.properties.children.content
'ABC'
>>> b.next.properties.children.name
'text'
>>> b.next.properties.next
<xmlAttr (help) object at 0x10ab40fc8>
>>> b.next.properties.next.name
'help'
>>> b.next.properties.next.content
'abc_help'
>>> list(tree)
[<xmlDoc (None) object at 0x10a921248>, <xmlNode (myGroups1) object at 0x10aba32d8>, <xmlNode (text) object at 0x10aba3878>, <xmlNode (myGrpContents) object at 0x10aba3d88>, <xmlNode (text) object at 0x10aba3950>, <xmlNode (myGrpKeyword) object at 0x10aba3758>, <xmlNode (text) object at 0x10aba3320>, <xmlNode (myGrpKeyword) object at 0x10aba3f38>, <xmlNode (text) object at 0x10aba3560>, <xmlNode (myGrpKeyword) object at 0x10aba3998>, <xmlNode (text) object at 0x10aba33f8>, <xmlNode (text) object at 0x10aba38c0>]
>>> good = list(tree)[5]
>>> good.properties
<xmlAttr (name) object at 0x10aba35f0>
>>> good.prop('name')
'abc1'
>>> good.prop('help')
'help1'
>>> good.prop('whoops')
>>> good.hasProp('whoops')
>>> good.hasProp('name')
<xmlAttr (name) object at 0x10ab40ef0>
>>> good.hasProp('name').content
'abc1'
>>> for thing in tree:
...     if thing.hasProp('name') and thing.hasProp('help'):
...         print thing.prop('name'), thing.prop('help')
...         
...     
... 
ABC abc_help
abc1 help1
abc2 help2
abc3 help3

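Because this was bpython, I cheated a little: it has a rewind key, so I made more typos than the session shows, but otherwise this is very close to what I actually typed.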
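The final loop of the session can be wrapped in a small helper that collects the pairs instead of printing them, which makes it easier to write them to another file afterwards. This is only a sketch: collect_name_help is a hypothetical name, and it assumes every node it receives offers the hasProp() and prop() methods used in the session above.

```python
def collect_name_help(nodes):
    """Collect (name, help) pairs from nodes that carry both attributes.

    `nodes` is assumed to be any iterable of objects that expose hasProp()
    and prop() the way the nodes in the session above do, for example the
    parsed tree itself.
    """
    pairs = []
    for node in nodes:
        # hasProp() gives back the attribute node, or None when it is missing.
        if node.hasProp('name') is not None and node.hasProp('help') is not None:
            pairs.append((node.prop('name'), node.prop('help')))
    return pairs
```

With the tree from the session, collect_name_help(tree) should give the same four pairs that the loop printed, ready to be formatted however the target file requires.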

Answered 2013-09-13T08:36:22.807

I haven't used libxml2 before, but I dug into the problem and found this.

Try:

if child.type == "element":
    if child.name == "myGrpKeyword":
        print child.prop('name')
        print child.prop('help')

or

if child.type == "element":
    if child.name == "myGrpKeyword":
        for property in child.properties:
            if property.type=='attribute':
                # check what is the attribute 
                if property.name == 'name':
                    print property.content
                if property.name == 'help':
                    print property.content

Reference: http://ukchill.com/technology/getting-started-with-libxml2-and-python-part-1/

Update:

Try a recursive function:

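import libxml2

def explore(child):
    # Walk this node and all of its following siblings.
    while child is not None:
        if not child.isBlankNode():
            if child.type == "element":
                # prop() returns None when the attribute is absent.
                print child.prop('name')
                print child.prop('help')
                # Recurse into the element's own children.
                explore(child.children)
        child = child.next

doc = libxml2.parseFile(cmmfilename)
root2 = doc.children
child = root2.children
explore(child)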
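If libxml2 is not a hard requirement, the same walk over all nested elements can be sketched with the standard library's xml.etree.ElementTree. Note that this swaps in a different library, not libxml2. The function name name_help_lines, the SAMPLE string and the "tag: name = help" line format are made up for illustration, since the question does not say what the target format should be.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question.
SAMPLE = """<myGroups1>
<myGrpContents name="ABC" help="abc_help">
     <myGrpKeyword name="abc1" help="help1"/>
     <myGrpKeyword name="abc2" help="help2"/>
     <myGrpKeyword name="abc3" help="help3"/>
</myGrpContents>
</myGroups1>"""


def name_help_lines(xml_text):
    """Return one 'tag: name = help' line per element that has both attributes."""
    root = ET.fromstring(xml_text)
    lines = []
    # iter() visits the root and all of its descendants in document order.
    for elem in root.iter():
        name = elem.get('name')
        help_text = elem.get('help')
        if name is not None and help_text is not None:
            lines.append('%s: %s = %s' % (elem.tag, name, help_text))
    return lines


for line in name_help_lines(SAMPLE):
    print(line)
```

The returned list can then be joined with newlines and written to the output file, with the format string adjusted to whatever layout is actually needed.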
Answered 2013-09-13T06:25:25.680