Node.getTextContent() 返回当前节点及其后代的文本内容。
有没有办法获取当前节点的文本内容,而不是后代的文本。
例子
<paragraph>
<link>XML</link>
is a
<strong>browser based XML editor</strong>
editor allows users to edit XML data in an intuitive word processor.
</paragraph>
预期产出
paragraph = is a editor allows users to edit XML data in an intuitive word processor.
link = XML
strong = browser based XML editor
我试过下面的代码
String str = "<paragraph>"+
"<link>XML</link>"+
" is a "+
"<strong>browser based XML editor</strong>"+
"editor allows users to edit XML data in an intuitive word processor."+
"</paragraph>";
org.w3c.dom.Document domDoc = null;
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder;
try {
docBuilder = docFactory.newDocumentBuilder();
ByteArrayInputStream bis = new ByteArrayInputStream(str.getBytes());
domDoc = docBuilder.parse(bis);
} catch (ParserConfigurationException e1) {
e1.printStackTrace();
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
DocumentTraversal traversal = (DocumentTraversal) domDoc;
NodeIterator iterator = traversal.createNodeIterator(
domDoc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);
for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
String tagname = ((Element) n).getTagName();
System.out.println(tagname + "=" + ((Element)n).getTextContent());
}
但它给出了这样的输出
paragraph=XML is a browser based XML editoreditor allows users to edit XML data in an intuitive word processor.
link=XML
strong=browser based XML editor
请注意,段落元素包含我不想要的链接文本和强标记。请提出一些想法?