
I am taking an XML file as input (the data in the file is like the index page of a book, with chapter names and some other information), and I use my code to retrieve a few values from it. The three values that I get from the file are:

Title (a long string of title to the chapter)
Number (chapter number)
ID (This is an ID associated with chapter, format: xxx-yy-zzz)

What I need to do is store these values in 5 separate columns of an Excel sheet, splitting the ID on its hyphens so that each part becomes its own sub-ID.

So I iterate over the file, get the Title, Number and ID, and join them with "-" in between to produce a string of the form

Title-Number-SubID1-SubID2-SubID3

I add each of these strings to a list, which I later iterate over, splitting on "-" to get each of the 5 values and write them to the Excel sheet.

My file has 113 unique entries, but I just noticed that my Excel sheet contains only 103 unique entries: 10 rows are duplicates, and 10 values that should be there are missing. I am really confused about what's happening.

EDIT:

This is where I build the string for each ID, which I pass in along with the XML document.

 public static String getBooksFromDoc(Document doc, String id)
        throws Exception {
    String idset = null;
    String title = null;
    String num = null;
    doc.getDocumentElement().normalize();
    XPath xPath = XPathFactory.newInstance().newXPath();
    XPathExpression xPathExpr = (XPathExpression) xPath
            .compile("//document[@id ='" + id + "']");
    NodeList nlist = (NodeList) xPathExpr.evaluate(doc,
            XPathConstants.NODESET);
    for (int i = 0; i < nlist.getLength(); i++) {
        rulebookProp = new RulebookProperties();
        Node nnode = nlist.item(i);
        XPathExpression xPath1 = (XPathExpression) xPath
                .compile(".//idset");
        Element eelement = (Element) nnode;
        Node idNode = (Node) xPath1.evaluate(eelement, XPathConstants.NODE);
        idset = idNode.getFirstChild().getNodeValue();

        XPathExpression xPath2 = (XPathExpression) xPath
                .compile(".//title");
        Element eelement1 = (Element) nnode;
        Node idNode1 = (Node) xPath2.evaluate(eelement1,
                XPathConstants.NODE);
        if (idNode1 == null) {
            title = " ";
        } else {
            title = idNode1.getFirstChild().getNodeValue();
        }

        XPathExpression xPath3 = (XPathExpression) xPath
                .compile(".//number");
        Element eelement2 = (Element) nnode;
        Node idNode2 = (Node) xPath3.evaluate(eelement2,
                XPathConstants.NODE);
        if (idNode2 == null) {
            num = " ";
        } else {
            num = idNode2.getFirstChild().getNodeValue();
        }
    }
    return title + "-" + num + "-" + idset;
}

I add each of the strings returned to a list.

List<String> books = new ArrayList<String>();

books.add(getBooksFromDoc(xmlDoc, id));

This is the method where I use the list to get the 5 values. (Note: in some occurrences the ID looks like xxx, xxx-yyy, or xxx-yyy-zzz, i.e. it can have one, two, or three parts, which explains the conditions in my code.)

public static List<BookObject> getBookEntries(
        List<String> books) {
    String bookTitle = " ";
    String bookID = " ";
    String bookElementID = " ";
    String recordID = " ";
    String bookNo = " ";


    for (String book : books) {

        String[] parts = book.split("-");
        if (parts.length == 5) {
            for (int i = 0; i < parts.length; i++) {
                bookTitle = parts[0];
                bookNo = parts[1];
                bookID = parts[2];
                bookElementID = parts[3];
                recordID = parts[4];
                bookObj = new BookObject();
                bookObj.setBookTitle(bookTitle);
                bookObj.setBookNo(bookNo);
                bookObj.setBookId(bookID);
                bookObj.setBookElementId(bookElementID);
                bookObj.setRecordId(recordID);
            }
        } else if (parts.length == 4) {
            for (int i = 0; i < parts.length; i++) {
                bookTitle = parts[0];
                bookNo = parts[1];
                bookID = parts[2];
                bookElementID = parts[3];
                bookObj = new BookObject();
                bookObj.setBookTitle(bookTitle);
                bookObj.setBookNo(bookNo);
                bookObj.setBookId(bookID);
                bookObj.setBookElementId(bookElementID);
                bookObj.setRecordId(recordID);
            }
        } else if (parts.length == 3) {
            for (int i = 0; i < parts.length; i++) {
                bookTitle = parts[0];
                bookNo = parts[1];
                bookID = parts[2];
                bookObj = new BookObject();
                bookObj.setBookTitle(bookTitle);
                bookObj.setBookNo(bookNo);
                bookObj.setBookId(bookID);
                bookObj.setBookElementId(bookElementID);
                bookObj.setRecordId(recordID);
            }       
        }
        bookEntries.add(bookObj);
    }
    return bookEntries;
}

Later I just iterate over bookEntries and add each entry to the Excel sheet. (I hope this makes it a little clearer.)

for (int i = 0; i < bookEntries.size(); i++) {
            Row dataRow = sheet.createRow(i+1);
            dataRow.createCell(0).setCellValue(
                    bookEntries.get(i).getBookTitle());
            dataRow.createCell(1).setCellValue(
                    bookEntries.get(i).getBookId());
            dataRow.createCell(2).setCellValue(
                    bookEntries.get(i).getBookElementId());
            dataRow.createCell(3).setCellValue(
                    bookEntries.get(i).getRecordId());
            dataRow.createCell(4).setCellValue(
                    bookEntries.get(i).getBookNo());
}

1 Answer


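I found the answer. The missing entries were caused by the format of the data in the XML. Ten of the entries have a hyphen in their title. I had overlooked this because most entries have no hyphen in their name. As a result, those strings split into 6 parts, which my code did not handle; I had assumed there would be at most 5. I have fixed it now and it works fine :)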
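For anyone hitting the same thing, here is a minimal sketch of one way to avoid the problem: keep the title and number as separate fields and split only the ID on hyphens, so a hyphen in the title can never shift the columns. The class name `BookRecord`, its fields, and the sample values are made up for illustration; they are not from the original code.

```java
import java.util.Arrays;

// Hypothetical holder class (not from the original code): keeps the three
// values apart instead of joining them into one "-"-delimited string.
class BookRecord {
    final String title;
    final String number;
    final String[] idParts; // always 3 slots; missing sub-IDs stay ""

    BookRecord(String title, String number, String idset) {
        this.title = title;
        this.number = number;
        this.idParts = new String[] {"", "", ""};
        // Only the ID is split, so hyphens in the title are harmless.
        String[] raw = idset.split("-");
        for (int i = 0; i < raw.length && i < 3; i++) {
            idParts[i] = raw[i];
        }
    }

    public static void main(String[] args) {
        // Made-up sample data with a hyphen in the title.
        String title = "Self-Help Basics";
        String number = "7";
        String idset = "abc-12-xyz";

        // The join-then-split approach yields 6 parts instead of 5.
        String joined = title + "-" + number + "-" + idset;
        System.out.println(joined.split("-").length); // prints 6

        // Splitting only the ID keeps the title intact.
        BookRecord r = new BookRecord(title, number, idset);
        System.out.println(r.title + " | " + r.number + " | "
                + Arrays.toString(r.idParts));
        // prints Self-Help Basics | 7 | [abc, 12, xyz]
    }
}
```

An ID with one or two parts simply leaves the remaining slots empty, so the separate `parts.length` branches are no longer needed. If `bookObj` is declared outside the loop, as the question's code suggests, a string that matches none of the branches would presumably re-add the previous object, which would also account for the 10 duplicates.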

Answered 2013-09-06T20:01:15.833