I have the following XML and I am trying to parse it.

<employee>
    <personal>
        <id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
        <name>Lareina</name>
        <age>50</age>
    </personal>
    <contact>
        <dept>Fusce</dept>
        <manager>CB9A0BB76</manager>
    </contact>
</employee>

But... well... I can't get it to work. I'm posting my code below. It does work for the "correctly" formatted XML, though (uncomment the first "xmlString").

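import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;

// JDOM 1.x packages; with JDOM 2 these would be org.jdom2.* instead.
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class XMLReader {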
    public static void main(String[] args) throws JDOMException, IOException {

        //String xmlString = "<employee >\n <firstname xml:space=\"preserve\" >John</firstname>\n <lastname>Watson</lastname>\n <age>30</age>\n <email>johnwatson@sh.com</email>\n</employee>";
        String xmlString = "<employee>\n" + 
                "       <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n" + 
                "       <name>Lareina</name>\n" + 
                "       <age>50</age>\n" + 
                "       </personal><contact><dept>Fusce</dept>\n" + 
                "       <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n" + 
                "   </employee>";
        System.out.println(xmlString);


        SAXBuilder builder = new SAXBuilder();
        Reader in = new StringReader(xmlString);

        Document doc = builder.build(in);
        Element root = doc.getRootElement();
        List children = root.getChildren();
        //System.out.println(children);
        String value = "";
        for (int i = 0; i < children.size(); i++) {

                Element dataNode = (Element) children.get(i);
               // Element dataNode = (Element) dataNodes.get(j);
                value += ", " +dataNode.getText().trim();
                System.out.println(dataNode.getName() + " : " + dataNode.getText());

                //context.write(new Text(rowKey.toString()), new Text(node.getName().trim() + " " + node.getText().trim()));

            }
        //System.out.println(in);



    }
}
1 Answer

Your two XML strings are different. The first is

<employee>
    <firstname xml:space="preserve">John</firstname>
    <lastname>Watson</lastname>
    <age>30</age>
    <email>johnwatson@sh.com</email>
</employee>

It has four (4) children, and each child has text. So it prints

firstname : John
lastname : Watson
age : 30
email : johnwatson@sh.com

The second is

<employee>
    <personal>
        <id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
        <name>Lareina</name>
        <age>50</age>
    </personal>
    <contact>
        <dept>Fusce</dept>
        <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager>
    </contact>
</employee>

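In the second one, the root has only two children, personal and contact, and neither has any text of its own: their values sit in nested elements, and getText() returns only the text held directly by an element, not the text of its children. So you get something like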

personal : 



contact : 

This is the expected output.
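
If the goal is to print the nested values (id, name, age, and so on), one option is to walk the tree recursively and print only the elements that have no child elements. Below is a minimal sketch of that idea. Note that it uses the JDK's built-in DOM parser (javax.xml.parsers / org.w3c.dom) rather than JDOM, so that it runs without extra libraries; the class name LeafElements and the method leafValues are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original code.
class LeafElements {

    // Returns "name : text" for every element that has no child elements.
    static List<String> leafValues(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<String> out = new ArrayList<>();
            collect(root, out);
            return out;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Depth-first walk: recurse into child elements, record only the leaves.
    private static void collect(Element element, List<String> out) {
        boolean hasChildElement = false;
        NodeList nodes = element.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect((Element) node, out);
            }
        }
        if (!hasChildElement) {
            out.add(element.getTagName() + " : " + element.getTextContent().trim());
        }
    }

    public static void main(String[] args) {
        String xml = "<employee>\n"
                + "  <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n"
                + "  <name>Lareina</name>\n"
                + "  <age>50</age>\n"
                + "  </personal><contact><dept>Fusce</dept>\n"
                + "  <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n"
                + "</employee>";
        for (String line : leafValues(xml)) {
            System.out.println(line);
        }
    }
}
```

With JDOM the same recursion should be expressible with getChildren() to descend and getTextTrim() to read a leaf's value; the structure of the walk stays the same.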

Answered 2013-09-26T22:38:53.207