java - Getting only the direct child elements of an XML element by name

Question

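My question is: how can I get the elements directly under a specific parent element when there are other elements with the same name that are "grandchildren" of that parent?

I'm using the Java DOM library to parse XML elements and I'm running into trouble. Here is a (small) portion of the XML I'm using: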

<notifications>
  <notification>
    <groups>
      <group name="zip-group.zip" zip="true">
        <file location="C:\valid\directory\" />
        <file location="C:\another\valid\file.doc" />
        <file location="C:\valid\file\here.txt" />
      </group>
    </groups>
    <file location="C:\valid\file.txt" />
    <file location="C:\valid\file.xml" />
    <file location="C:\valid\file.doc" />
  </notification>
</notifications>

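As you can see, there are two places where a <file> element can be placed: inside a group or outside of one. I really want it structured this way because it is more user-friendly.

Now, whenever I call notificationElement.getElementsByTagName("file"); it gives me all the <file> elements, including the ones under the <group> element. I handle each of these kinds of files differently, so this behavior is not desirable.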

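I've thought of two solutions:

1. Get the parent element of each file element and handle it accordingly (depending on whether it is <notification> or <group>).
2. Rename the second <file> element to avoid confusion.

Neither of those solutions is as desirable as just leaving things the way they are and getting only the <file> elements that are direct children of <notification> elements.

I'm open to IMPO comments and answers about the "best" way to do this, but I'm really interested in DOM solutions because that's what the rest of this project is using. Thanks.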
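The behavior described in the question can be reproduced with a small sketch. The class name, the helper methods and the shortened attribute values below are illustrative assumptions, not part of the original project; only the DOM calls themselves come from the standard library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DescendantCountDemo {
    // Trimmed-down copy of the XML from the question (locations shortened for the example).
    static final String XML =
        "<notifications><notification>"
        + "<groups><group name=\"zip-group.zip\" zip=\"true\">"
        + "<file location=\"a\"/><file location=\"b\"/><file location=\"c\"/>"
        + "</group></groups>"
        + "<file location=\"d\"/><file location=\"e\"/><file location=\"f\"/>"
        + "</notification></notifications>";

    // Parses a string into a DOM document; checked exceptions are wrapped to keep the demo short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts every <file> descendant of the first <notification>, at any depth.
    static int countFileDescendants(String xml) {
        Element notification =
            (Element) parse(xml).getElementsByTagName("notification").item(0);
        return notification.getElementsByTagName("file").getLength();
    }

    public static void main(String[] args) {
        // Three files sit inside <group> and three directly under <notification>.
        System.out.println(countFileDescendants(XML)); // prints 6, not 3
    }
}
```

Element.getElementsByTagName searches all descendants, which is why the grouped files show up alongside the direct children.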

score 23 · Accepted Answer

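I realise you found something of a solution to this in May, @kentcdodds, but I've just run into a fairly similar problem and I think I've now found a solution (perhaps for my use case, but not for yours).

A very simplistic example of my XML format is shown below:-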

<?xml version="1.0" encoding="utf-8"?>
<rels>
    <relationship num="1">
        <relationship num="2">
            <relationship num="2.1"/>
            <relationship num="2.2"/>
        </relationship>
    </relationship>
    <relationship num="1.1"/>
    <relationship num="1.2"/>

</rels>

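As you can hopefully see from this snippet, the format I want allows N levels of nesting for [relationship] nodes, so the problem I had with Node.getChildNodes() was that I was getting all nodes from all levels of the hierarchy, without any hint as to node depth.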

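After looking at the API for a while, I noticed there are actually two other methods that might be of use: Node.getFirstChild() and Node.getNextSibling().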

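Together, these two methods seem to offer everything required to get all of the immediate descendant elements of a node. The following JSP code should give a fairly basic idea of how to implement this. Sorry for the JSP. I'm rolling this into a bean now, but I haven't had time to create a fully working version from the picked-apart code.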

<%@page import="javax.xml.parsers.DocumentBuilderFactory,
                javax.xml.parsers.DocumentBuilder,
                org.w3c.dom.Document,
                org.w3c.dom.NodeList,
                org.w3c.dom.Node,
                org.w3c.dom.Element,
                java.io.File" %><% 
try {

    File fXmlFile = new File(application.getRealPath("/") + "/utils/forms-testbench/dom-test/test.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);
    doc.getDocumentElement().normalize();

    Element docEl = doc.getDocumentElement();
    // Walk only the root's direct children: start at the first child and follow
    // the sibling chain, so the first child is not skipped and an empty root is safe.
    for (Node childNode = docEl.getFirstChild(); childNode != null; childNode = childNode.getNextSibling()) {
        if (childNode.getNodeType() == Node.ELEMENT_NODE) {
            Element childElement = (Element) childNode;
            out.println("NODE num:-" + childElement.getAttribute("num") + "<br/>\n" );
        }
    }

} catch (Exception e) {
    out.println("ERROR:- " + e.toString() + "<br/>\n");
}

%>

This code gives the following output, showing only the direct child elements of the root node.

NODE num:-1
NODE num:-1.1
NODE num:-1.2

Hope this helps someone. Cheers for the initial post.
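For readers who are not working in JSP, the same sibling walk can be sketched as a plain Java class. The class and method names here are hypothetical; the XML is the example from this answer with the whitespace removed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class SiblingWalk {
    static final String XML =
        "<rels>"
        + "<relationship num=\"1\"><relationship num=\"2\">"
        + "<relationship num=\"2.1\"/><relationship num=\"2.2\"/>"
        + "</relationship></relationship>"
        + "<relationship num=\"1.1\"/><relationship num=\"1.2\"/>"
        + "</rels>";

    // Collects the direct child elements of parent by following the sibling chain.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Returns the "num" attribute of each direct child element of the root.
    static List<String> rootChildNums(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> nums = new ArrayList<>();
            for (Element e : childElements(doc.getDocumentElement())) {
                nums.add(e.getAttribute("num"));
            }
            return nums;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootChildNums(XML)); // prints [1, 1.1, 1.2]
    }
}
```

The node-type check matters because text nodes (such as indentation whitespace) are siblings too.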

score 15 · Accepted Answer

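You can use XPath for this, using two paths to get them and process them differently.

To get the <file> nodes that are direct children of <notification>, use //notification/file, and for the ones in <group>, use //groups/group/file.

This is a simple example: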

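import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SO10689900 {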
    public static void main(String[] args) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader("<notifications>\n" + 
                "  <notification>\n" + 
                "    <groups>\n" + 
                "      <group name=\"zip-group.zip\" zip=\"true\">\n" + 
                "        <file location=\"C:\\valid\\directory\\\" />\n" + 
                "        <file location=\"C:\\this\\file\\doesn't\\exist.grr\" />\n" + 
                "        <file location=\"C:\\valid\\file\\here.txt\" />\n" + 
                "      </group>\n" + 
                "    </groups>\n" + 
                "    <file location=\"C:\\valid\\file.txt\" />\n" + 
                "    <file location=\"C:\\valid\\file.xml\" />\n" + 
                "    <file location=\"C:\\valid\\file.doc\" />\n" + 
                "  </notification>\n" + 
                "</notifications>")));
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr1 = xpath.compile("//notification/file");
        NodeList nodes = (NodeList)expr1.evaluate(doc, XPathConstants.NODESET);
        System.out.println("Files in //notification");
        printFiles(nodes);

        XPathExpression expr2 = xpath.compile("//groups/group/file");
        NodeList nodes2 = (NodeList)expr2.evaluate(doc, XPathConstants.NODESET);
        System.out.println("Files in //groups/group");
        printFiles(nodes2);
    }

    public static void printFiles(NodeList nodes) {
        for (int i = 0; i < nodes.getLength(); ++i) {
            Node file = nodes.item(i);
            System.out.println(file.getAttributes().getNamedItem("location"));
        }
    }
}

It should output:

Files in //notification
location="C:\valid\file.txt"
location="C:\valid\file.xml"
location="C:\valid\file.doc"
Files in //groups/group
location="C:\valid\directory\"
location="C:\this\file\doesn't\exist.grr"
location="C:\valid\file\here.txt"
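When a specific <notification> element is already in hand, as in the question, a relative XPath can be evaluated against that element instead of the whole document. The sketch below assumes a shortened copy of the question's XML and uses hypothetical class and method names; XPath.evaluate(String, Object, QName) is the standard javax.xml.xpath call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RelativeXPathDemo {
    static final String XML =
        "<notifications><notification>"
        + "<groups><group name=\"zip-group.zip\" zip=\"true\">"
        + "<file location=\"a\"/><file location=\"b\"/><file location=\"c\"/>"
        + "</group></groups>"
        + "<file location=\"d\"/><file location=\"e\"/><file location=\"f\"/>"
        + "</notification></notifications>";

    // Evaluates a relative path with the first <notification> as the context node
    // and returns the "location" attribute of every matching element.
    static List<String> locations(String xml, String relativePath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element notification =
                (Element) doc.getElementsByTagName("notification").item(0);
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                relativePath, notification, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("location"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(locations(XML, "file"));              // prints [d, e, f]
        System.out.println(locations(XML, "groups/group/file")); // prints [a, b, c]
    }
}
```

The path "file" has no leading slashes, so it matches only child elements of the context node.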

score 14 · Accepted Answer

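Well, the DOM solution to this question is actually pretty simple, even if it isn't too elegant.

When I iterate through the filesNodeList, which is returned when I call notificationElement.getElementsByTagName("file"), I just check whether the parent node's name is "notification". If it isn't, I ignore it, because it will be handled by the <group> element. Here's my code solution: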

for (int j = 0; j < filesNodeList.getLength(); j++) {
  Element fileElement = (Element) filesNodeList.item(j);
  if (!fileElement.getParentNode().getNodeName().equals("notification")) {
    continue;
  }
  ...
}

score 5 · Accepted Answer

If you stick with the DOM API:

NodeList nodeList = doc.getElementsByTagName("notification")
    .item(0).getChildNodes();

// get the immediate children (1st generation)
for (int i = 0; i < nodeList.getLength(); i++) {
    switch (nodeList.item(i).getNodeType()) {
        case Node.ELEMENT_NODE:
            Element element = (Element) nodeList.item(i);
            System.out.println("element name: " + element.getNodeName());
            // check the element name
            if (element.getNodeName().equalsIgnoreCase("file")) {
                // do something with your "file" element (first-generation child)
                System.out.println("element name: "
                    + element.getNodeName() + " attribute: "
                    + element.getAttribute("location"));
            }
            break;
    }
}

Our first task is to get the "notification" element (in this case the first one, item(0)) and all of its children:

NodeList nodeList = doc.getElementsByTagName("notification")
    .item(0).getChildNodes();

(Later you can process every "notification" element by iterating over all the items that getElementsByTagName returns.)

For each child of "notification":

for (int i = 0; i < nodeList.getLength(); i++)

You first get its type to see whether it is an element:

switch (nodeList.item(i).getNodeType()) {
    case Node.ELEMENT_NODE:
        //.......
        break;  
}

If it is, you have a direct child of "notification" (possibly a "file"), not a grandchild.

You can then check its name:

if (element.getNodeName().equalsIgnoreCase("file"))
{

    // do something with your "file" element (first-generation child)

    System.out.println("element name: "
        + element.getNodeName() + " attribute: "
        + element.getAttribute("location"));

}

The output is:

element name: groups
element name: file
element name: file attribute: C:\valid\file.txt
element name: file
element name: file attribute: C:\valid\file.xml
element name: file
element name: file attribute: C:\valid\file.doc
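The walkthrough above handles only the first <notification>. A minimal sketch of extending it to every <notification> in the document might look like the following; the helper names and the two-notification sample XML are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PerNotificationFiles {
    // Hypothetical sample: two notifications, the first with one grouped file.
    static final String XML =
        "<notifications>"
        + "<notification><groups><group name=\"g\"><file location=\"g1\"/></group></groups>"
        + "<file location=\"a\"/><file location=\"b\"/></notification>"
        + "<notification><file location=\"c\"/></notification>"
        + "</notifications>";

    // Returns the child elements of parent whose tag name equals tag (direct children only).
    static List<Element> directChildrenByTag(Element parent, String tag) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && child.getNodeName().equals(tag)) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // For each <notification>, lists the locations of its direct <file> children.
    static List<List<String>> fileLocationsPerNotification(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList notifications = doc.getElementsByTagName("notification");
            List<List<String>> all = new ArrayList<>();
            for (int i = 0; i < notifications.getLength(); i++) {
                List<String> locations = new ArrayList<>();
                for (Element file : directChildrenByTag((Element) notifications.item(i), "file")) {
                    locations.add(file.getAttribute("location"));
                }
                all.add(locations);
            }
            return all;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fileLocationsPerNotification(XML)); // prints [[a, b], [c]]
    }
}
```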

score 4 · Accepted Answer

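I had the same problem in one of my projects and wrote a small function that returns a List<Element> containing only the immediate children. Basically, for every node returned by getElementsByTagName, it checks whether its parentNode is actually the node whose children we are searching for: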

public static List<Element> getDirectChildsByTag(Element el, String sTagName) {
        NodeList allChilds = el.getElementsByTagName(sTagName);
        List<Element> res = new ArrayList<>();

        for (int i = 0; i < allChilds.getLength(); i++) {
            if (allChilds.item(i).getParentNode().equals(el))
                res.add((Element) allChilds.item(i));
        }

        return res;
    }

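The accepted answer by kentcdodds will return wrong results (grandchildren) if there is a child node that is itself called "notification", for example if the "group" element were named "notification". I ran into that kind of setup in my project, which is why I came up with my function.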
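The difference between checking the parent's name and checking the parent's identity can be seen in a small sketch. The XML below is a hypothetical layout in which a nested element is also called "notification"; the class and method names are illustrative, and Node.isSameNode is used in place of equals to make the identity comparison explicit.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParentCheckDemo {
    // Hypothetical layout: the inner container shares the outer element's tag name.
    static final String XML =
        "<notification>"
        + "<notification><file location=\"nested\"/></notification>"
        + "<file location=\"direct\"/>"
        + "</notification>";

    static Element root(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts <file> descendants whose parent merely has the same tag name as el.
    static int countByParentName(Element el) {
        NodeList files = el.getElementsByTagName("file");
        int count = 0;
        for (int i = 0; i < files.getLength(); i++) {
            if (files.item(i).getParentNode().getNodeName().equals(el.getNodeName())) {
                count++;
            }
        }
        return count;
    }

    // Counts <file> descendants whose parent is el itself.
    static int countByParentIdentity(Element el) {
        NodeList files = el.getElementsByTagName("file");
        int count = 0;
        for (int i = 0; i < files.getLength(); i++) {
            if (files.item(i).getParentNode().isSameNode(el)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Element outer = root(XML);
        System.out.println(countByParentName(outer));     // prints 2 (includes the nested file)
        System.out.println(countByParentIdentity(outer)); // prints 1 (direct child only)
    }
}
```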

score 0 · Accepted Answer

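I ran into a related problem where I needed to process only the immediate child nodes, even though all "file" nodes are treated similarly. For my solution, I compare each element's parent node with the node being processed to determine whether the element is an immediate child.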

NodeList fileNodes = parentNode.getElementsByTagName("file");
for(int i = 0; i < fileNodes.getLength(); i++){
            if(parentNode.equals(fileNodes.item(i).getParentNode())){
                if (fileNodes.item(i).getNodeType() == Node.ELEMENT_NODE) {

                    //process the child node...
                }
            }
        }

score 0 · Accepted Answer

I wrote this function to get a node's value by tag name, restricted to the top level:

public static String getValue(Element item, String tagToGet, String parentTagName) {
    NodeList n = item.getElementsByTagName(tagToGet);
    Node nodeToGet = null;
    for (int i = 0; i<n.getLength(); i++) {
        if (n.item(i).getParentNode().getNodeName().equalsIgnoreCase(parentTagName)) {
            nodeToGet = n.item(i);
        }
    }
    return getElementValue(nodeToGet);
}

public final static String getElementValue(Node elem) {
    Node child;
    if (elem != null) {
        if (elem.hasChildNodes()) {
            for (child = elem.getFirstChild(); child != null; child = child
                    .getNextSibling()) {
                if (child.getNodeType() == Node.TEXT_NODE) {
                    return child.getNodeValue();
                }
            }
        }
    }
    return "";
}

score 0 · Accepted Answer

I ended up creating an extension function in Kotlin to do this:

fun Element.childrenWithTagName(name: String): List<Node> = childNodes
    .asList()
    .filter { it.nodeName == name }

Callers can use it like this:

val meta = target.newChildElement("meta-coverage")
source.childrenWithTagName("counter").forEach {
    meta.copyElementWithAttributes(it)
}

Here asList is implemented as:

fun NodeList.asList(): List<Node> = InternalNodeList(this)

private class InternalNodeList(
    private val list: NodeList,
    override val size: Int = list.length
) : RandomAccess, AbstractList<Node>() {
    override fun get(index: Int): Node = list.item(index)
}

score 0 · Accepted Answer

There is a nice LINQ solution:

For Each child As XmlElement In From cn As XmlNode In xe.ChildNodes Where cn.Name = "file"
    ...
Next
