python - Getting a dynamic XPath with ElementTree getpath()

Question

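I need to write a dynamic function that finds elements in a subtree of an Atom XML document by building each element's XPath dynamically.

To do that, I wrote something like this: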

    tree = etree.parse(xmlFileUrl)
    e = etree.XPathEvaluator(tree, namespaces={'def':'http://www.w3.org/2005/Atom'})
    entries = e('//def:entry')
    for entry in entries:
        mypath = tree.getpath(entry) + "/category"
        category = e(mypath)

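The code above cannot find the category elements, because getpath() returns an XPath without namespaces, while the XPathEvaluator e() requires them.

I know I could take the path and supply the namespaces in the call to XPathEvaluator, but I would like to know whether getpath() can be made to return a "fully qualified" path that includes all namespaces, since that would be convenient in some cases.

(This is a follow-up to my earlier question: Python XpathEvaluator without namespace)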

score 4 · Accepted Answer

With Python's standard xml.etree library, you need a different traversal function. To that end, you can build a modified version of the iter method, like this:

    def etree_iter_path(node, tag=None, path='.'):
        """Yield (element, path) pairs for node and all of its descendants."""
        if tag == "*":
            tag = None  # "*" means "match every tag"
        if tag is None or node.tag == tag:
            yield node, path
        for child in node:
            # child.tag is in Clark notation ({namespace}name) for namespaced elements
            child_path = '%s/%s' % (path, child.tag)
            yield from etree_iter_path(child, tag, path=child_path)

You can then use this function to iterate over the tree from the root node:

    from xml.etree import ElementTree

    # xml_path is a placeholder for the path to your XML file
    xmldoc = ElementTree.parse(xml_path)
    for elem, path in etree_iter_path(xmldoc.getroot()):
        print(elem, path)
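To make the shape of the generated paths concrete, here is a minimal sketch that runs the same kind of generator over a small in-memory feed. The sample document and the names SAMPLE, ATOM_NS, category_path and found are made up for illustration. The paths come out in Clark notation ({namespace}name), which the standard library's find() accepts directly.

```python
from xml.etree import ElementTree

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical sample feed, used only for illustration.
SAMPLE = f"""<feed xmlns="{ATOM_NS}">
  <title>Example</title>
  <entry>
    <title>First</title>
    <category term="news"/>
  </entry>
</feed>"""


def etree_iter_path(node, tag=None, path="."):
    """Yield (element, path) pairs for node and all of its descendants."""
    if tag == "*":
        tag = None
    if tag is None or node.tag == tag:
        yield node, path
    for child in node:
        # child.tag already carries the namespace in Clark notation
        yield from etree_iter_path(child, tag, path=f"{path}/{child.tag}")


root = ElementTree.fromstring(SAMPLE)
paths = [path for _, path in etree_iter_path(root)]
for p in paths:
    print(p)

# The last path points at the category element; feed it back into find()
category_path = paths[-1]
found = root.find(category_path)
print(found.get("term"))
```

Note that these paths carry no positional predicates, so when a feed has several entries the same path appears more than once and find() returns only the first match.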

score 2 · Accepted Answer

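From the documentation at http://lxml.de/xpathxslt.html#the-xpath-class:

ElementTree objects have a method getpath(element), which returns a structural, absolute XPath expression to find that element:

So the answer to your question is that getpath() will not return a "fully qualified" path, since otherwise the function would take a parameter for that. The only guarantee is that the returned XPath expression will find that element.

You may be able to combine getpath and xpath (and the XPath class) to do what you want.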

score 2 · Accepted Answer

Rather than trying to build the full path from the root, you can evaluate an XPath expression with the entry as the base node:

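    from lxml import etree

    tree = etree.parse(xmlFileUrl)
    nsmap = {'def': 'http://www.w3.org/2005/Atom'}
    entries_expr = etree.XPath('//def:entry', namespaces=nsmap)
    # Atom's category element is namespaced too, so it needs the prefix as well
    category_expr = etree.XPath('def:category', namespaces=nsmap)
    for entry in entries_expr(tree):
        # Evaluated relative to the entry, not the document root
        category = category_expr(entry)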

If performance is not important, you can simplify the code by using the .xpath() method on elements instead of precompiled expressions:

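    from lxml import etree

    tree = etree.parse(xmlFileUrl)
    nsmap = {'def': 'http://www.w3.org/2005/Atom'}
    for entry in tree.xpath('//def:entry', namespaces=nsmap):
        # The category step needs the namespace prefix as well
        category = entry.xpath('def:category', namespaces=nsmap)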
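The same relative-lookup idea can be sketched with the standard library's xml.etree.ElementTree, for cases where lxml is not available. This is only an illustration under assumptions: the inline feed and the names SAMPLE, terms_per_entry and unqualified_hits are hypothetical. It also shows why the namespace prefix matters: an unqualified 'category' step matches nothing in an Atom feed.

```python
from xml.etree import ElementTree

# Hypothetical sample feed, used only for illustration.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>First</title>
    <category term="news"/>
  </entry>
  <entry>
    <title>Second</title>
    <category term="sport"/>
    <category term="local"/>
  </entry>
</feed>"""

nsmap = {"def": "http://www.w3.org/2005/Atom"}
root = ElementTree.fromstring(SAMPLE)

terms_per_entry = []
unqualified_hits = 0
for entry in root.findall("def:entry", nsmap):
    # Relative, namespace-qualified lookup starting at the entry itself
    categories = entry.findall("def:category", nsmap)
    terms_per_entry.append([c.get("term") for c in categories])
    # The same lookup without the prefix finds no namespaced elements
    unqualified_hits += len(entry.findall("category"))

print(terms_per_entry)
print(unqualified_hits)
```

Because every lookup starts at the entry, no absolute path from the root is needed.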
