You can get a good idea of what is going on by printing the DOM tree:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// The class name is arbitrary; it only makes the snippet compilable.
public class PrintDomTree {
    public static void main(String[] args) throws IOException, ParserConfigurationException, SAXException {
        final String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
                + "<people>"
                + " <!-- a comment -->"
                + " <student>"
                + " <name>John</name>"
                + " <!-- a comment -->"
                + " <course>Computer Technology</course>"
                + " <semester>6</semester>"
                + " <scheme>E</scheme>"
                + " </student>"
                + ""
                + " <student>"
                + " <name>Foo</name>"
                + " <course>Industrial Electronics</course>"
                + " <semester>6</semester>"
                + " <scheme>E</scheme>"
                + " </student>"
                + "</people>";
        // Encode explicitly as UTF-8 so the bytes match the XML declaration
        // regardless of the platform default charset.
        final Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        printNodes(document.getDocumentElement(), 0);
    }

    // Recursively prints every node, indented by one tab per level of depth.
    private static void printNodes(final Node node, final int depth) {
        final StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < depth; ++i) {
            prefix.append("\t");
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println(prefix.toString() + "Going into " + node.getNodeName());
            final NodeList nodeList = node.getChildNodes();
            for (int i = 0; i < nodeList.getLength(); ++i) {
                printNodes(nodeList.item(i), depth + 1);
            }
        } else if (node.getNodeType() == Node.COMMENT_NODE) {
            System.out.println(prefix.toString() + "Comment node: \"" + node.getTextContent() + "\"");
        } else {
            // In this document, everything else is a text node.
            System.out.println(prefix.toString() + "Text node: \"" + node.getTextContent() + "\"");
        }
    }
}
The output of this is:
Going into people
    Text node: " "
    Comment node: " a comment "
    Text node: " "
    Going into student
        Text node: " "
        Going into name
            Text node: "John"
        Text node: " "
        Comment node: " a comment "
        Text node: " "
        Going into course
            Text node: "Computer Technology"
        Text node: " "
        Going into semester
            Text node: "6"
        Text node: " "
        Going into scheme
            Text node: "E"
        Text node: " "
    Text node: " "
    Going into student
        Text node: " "
        Going into name
            Text node: "Foo"
        Text node: " "
        Going into course
            Text node: "Industrial Electronics"
        Text node: " "
        Going into semester
            Text node: "6"
        Text node: " "
        Going into scheme
            Text node: "E"
        Text node: " "
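As you can see, there are text nodes all over the place between the elements you would expect to see. That is because, in principle, text is allowed around the child elements, for example: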
<student>
    some random text
    <course>Computer</course>
    some more text
</student>
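So the DOM tree has to account for that. If the XML were not pretty-printed but written on a single line with nothing between the tags, those whitespace-only text nodes would not appear at all, because the parser does not create empty text nodes. You can already see this in the output above: there is no text node between the last </student> and </people>.
Play around with the document and see how it affects the output.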
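As a starting point for that kind of experiment, here is a minimal sketch that compares the child count of a spaced-out element with that of a compact one, and shows one common way to walk only the element children. The class name `ChildNodeDemo` and the helpers `parseRoot` and `elementChildren` are made up for this illustration; the sketch assumes the default, non-validating JDK parser, which keeps whitespace-only text nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class and method names, for illustration only.
public class ChildNodeDemo {

    // Parses the XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    // Returns only the element children, skipping text and comment nodes.
    static List<Element> elementChildren(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String spaced = "<student> <name>John</name> <semester>6</semester> </student>";
        String compact = "<student><name>John</name><semester>6</semester></student>";

        Element spacedRoot = parseRoot(spaced);
        Element compactRoot = parseRoot(compact);

        // Whitespace between the tags becomes text nodes: text, name, text, semester, text.
        System.out.println("spaced children:  " + spacedRoot.getChildNodes().getLength());
        // No whitespace between the tags, so only the two elements remain.
        System.out.println("compact children: " + compactRoot.getChildNodes().getLength());
        // Filtering by node type gives the same two elements in both cases.
        for (Element e : elementChildren(spacedRoot)) {
            System.out.println("element: " + e.getNodeName());
        }
    }
}
```

Filtering on `Node.ELEMENT_NODE` is usually the simplest fix when you only care about the elements; it works the same whether or not the document is pretty-printed.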