java - jdom2 XPath query gives unclear results

Question

I have a question about jdom2 XPath:

test.xhtml code:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="cs" lang="cs">
<head>
<title>mypage</title>
</head>
<body>
<div class="in">
<a class="nextpage" href="url.html">
<img src="img/url.gif" alt="to url.html" />
</a>
</div>
</body>
</html>

Java code:

Document document;
SAXBuilder saxBuilder = new SAXBuilder();

document = saxBuilder.build("test2.html");
XPathFactory xpfac = XPathFactory.instance();
XPathExpression<Element> xp = xpfac.compile("//a[@class = 'nextpage']", Filters.element());
for (Element att : xp.evaluate(document) ) {
  System.out.println("We have target " + att.getAttributeValue("href"));
}

But with this I cannot get any elements. I found that when the query is //*[@class = 'nextpage'], it does find the element:

We have target url.html

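It must be something to do with the namespace, or something else in the header, because without it the query produces output. I don't know what I'm doing wrong.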

score 0 · Accepted Answer

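Note: although this is the same problem as the one described in the suggested duplicate, that other question concerns JDOM version 1.x. There are a number of significant differences in JDOM 2.x, and this answer relates to the significantly different JDOM 2.x XPath implementation.

The XPath specification is very clear about how namespaces are handled in XPath expressions. Unfortunately, for people who are familiar with XML, XPath's handling of namespaces differs slightly from what they expect. This is what the specification says:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

In practice, this means that whenever your XML document has a "default" namespace, you still need to use a prefix for that namespace when you refer to it in an XPath expression. The XPathFactory.compile(...) method hints at this requirement in its JavaDoc, but it is not as clear as it should be. The prefix you use is arbitrary and local to that XPath expression. In your case, the code would look something like this (assuming we choose the prefix xhtml for the namespace URI http://www.w3.org/1999/xhtml):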

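XPathFactory xpfac = XPathFactory.instance();
// The prefix "xhtml" is arbitrary; what matters is the namespace URI it is bound to.
Namespace xhtml = Namespace.getNamespace("xhtml", "http://www.w3.org/1999/xhtml");
// Third argument: XPath variables (none here, so null); the rest: namespaces visible to the expression.
XPathExpression<Element> xp = xpfac.compile("//xhtml:a[@class = 'nextpage']", Filters.element(), null, xhtml);
for (Element att : xp.evaluate(document) ) {
    System.out.println("We have target " + att.getAttributeValue("href"));
}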
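
The rule quoted from the specification comes from XPath itself, not from JDOM. As a rough illustration, the sketch below reproduces the same three queries with the JDK's built-in javax.xml.xpath API instead of JDOM2, so it runs without extra dependencies. The class name DefaultNamespaceXPathDemo, the helper count, and the trimmed-down inline document (DOCTYPE removed so that no DTD is fetched) are assumptions made for this example and are not part of the original question.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DefaultNamespaceXPathDemo {

    static final String XHTML_URI = "http://www.w3.org/1999/xhtml";

    // Hypothetical, shortened stand-in for test.xhtml with the same default namespace.
    static final String XML =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><div class=\"in\">"
      + "<a class=\"nextpage\" href=\"url.html\"/></div></body></html>";

    /** Counts the nodes selected by the expression; only the prefix "xhtml" is bound. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // without this the DOM carries no namespace information
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "xhtml".equals(prefix) ? XHTML_URI : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String uri) {
                    return XHTML_URI.equals(uri) ? "xhtml" : null;
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return XHTML_URI.equals(uri)
                            ? Collections.singletonList("xhtml").iterator()
                            : Collections.<String>emptyIterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name test: looks for <a> in no namespace.
        System.out.println(count("//a[@class = 'nextpage']"));
        // Prefixed name test: looks for <a> in the XHTML namespace.
        System.out.println(count("//xhtml:a[@class = 'nextpage']"));
        // Wildcard: matches elements regardless of name or namespace.
        System.out.println(count("//*[@class = 'nextpage']"));
    }
}
```

Only the prefixed and wildcard queries select the link, which matches what the question observed. The class attribute needs no prefix in any of the queries because unprefixed attributes are never placed in the default namespace.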

I should add this to the FAQ... thanks.
