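I am trying to use TagSoup with XPath (JAXP). I know how to get a SAX parser (an XMLReader) from TagSoup, but I can't find how to create a DocumentBuilder that uses that SAX parser. How do I do this?
Thank you.
EDIT: Sorry for being so general, but the Java XML APIs are such a pain.
EDIT 2:
Problem solved: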
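import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.xpath.*;
import org.ccil.cowan.tagsoup.Parser;
import org.w3c.dom.*;
import org.xml.sax.*;

public class TagSoupXPath { // class name is arbitrary
    public static void main(String[] args)
            throws IOException, SAXException, TransformerException, XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // TagSoup's Parser is itself a SAX XMLReader.
        XMLReader reader = new Parser();
        // With namespaces off, elements are not put in the XHTML namespace, so an unprefixed //span matches.
        reader.setFeature(Parser.namespacesFeature, false);
        // An identity Transformer copies the SAX events into a DOM tree; no DocumentBuilder is needed.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();
        try (InputStream input = new FileInputStream("/tmp/g.html")) {
            transformer.transform(new SAXSource(reader, new InputSource(input)), result);
        }
        Node htmlNode = result.getNode();
        NodeList nodes = (NodeList) xpath.evaluate("//span", htmlNode, XPathConstants.NODESET);
        System.out.println(nodes.getLength());
    }
}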
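For anyone who wants to try the same SAXSource-to-DOMResult trick without TagSoup on the classpath, here is a rough sketch that uses the JDK's built-in SAX parser as a stand-in XMLReader. The class and method names (SaxToDomSketch, toDom, countMatches) are made up for illustration, and the input has to be well-formed XML, since the JDK parser does not clean up real-world HTML the way TagSoup does. Swapping in new Parser() from TagSoup for the reader should be the only change needed, but that part is an assumption rather than something tested here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper class; the names are illustrative only.
class SaxToDomSketch {

    // Feeds the events of any SAX XMLReader through an identity
    // Transformer and collects them as a DOM tree.
    static Node toDom(XMLReader reader, InputSource source) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();
        identity.transform(new SAXSource(reader, source), result);
        return result.getNode();
    }

    // Parses the given well-formed XML and counts the nodes matched by the XPath expression.
    static int countMatches(String xml, String expression) throws Exception {
        // Stand-in reader: the JDK's own SAX parser (assumption: TagSoup's
        // Parser could be used here instead, as in the code above).
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        Node root = toDom(reader, new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, root, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html><body><span>a</span><span>b</span></body></html>";
        System.out.println(countMatches(xml, "//span"));
    }
}
```

Keeping toDom separate from the XPath part means the resulting Node can also be handed to any other DOM-based API, not just XPath.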
EDIT 3:
The link that helped me: http://www.jezuk.co.uk/cgi-bin/view/jez?id=2643