
I am new to Java. I want to read a text file and write its contents out as XML. This is my input:

    1. R.-J. Roe, J. Appl.Phys. 36, 2024 (1965).

and the output should be:

    <ref id="1">
        <label>1</label>
        <citation-alternatives>
            <mixed-citation>R.-J. Roe, J. Appl.Phys. 36, 2024 (1965).</mixed-citation>
        </citation-alternatives>
    </ref>

In many cases this input is split across two lines, with no space between them, like this:

    1. R.-J. Roe,
    J. Appl.Phys. 36, 2024 (1965).

The output then looks like this:

    <ref id="1">
        <label>1</label>
        <citation-alternatives>
            <mixed-citation>R.-J. Roe, </mixed-citation>
        </citation-alternatives>
    </ref>

    <ref id="1">
        <label>1</label>
        <citation-alternatives>
            <mixed-citation>J. Appl.Phys. 36, 2024 (1965).</mixed-citation>
        </citation-alternatives>
    </ref>

Now my question is: how can I read these two lines as a single record, so that I get the first output? This is my code:

    try {
        String strLine;
        String num = "";
        String mix = "";
        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

        // Back element
        Document doc = docBuilder.newDocument();
        Element rootElement = doc.createElement("Back");
        doc.appendChild(rootElement);

        // ref-list element
        Element reflist = doc.createElement("ref-list");
        rootElement.appendChild(reflist);

        while ((strLine = br.readLine()) != null) {
            if (strLine.equals("")) {
                continue;
            }
            int dotIndex = strLine.indexOf(".");

            num = strLine.substring(0, dotIndex);
            mix = strLine.substring(dotIndex + 2, strLine.length());

            // ref element
            Element ref = doc.createElement("ref");
            reflist.appendChild(ref);

            // set attribute of ref element
            Attr attr = doc.createAttribute("id");
            attr.setValue(num);
            ref.setAttributeNode(attr);

            // label element
            Element label = doc.createElement("label");
            ref.appendChild(label);
            label.setTextContent(num);

            // citation-alternatives element
            Element citationalternatives = doc.createElement("citation-alternatives");
            ref.appendChild(citationalternatives);

            // mixed-citation element
            Element mixedcitation = doc.createElement("mixed-citation");
            citationalternatives.appendChild(mixedcitation);
            mixedcitation.setTextContent(mix);
        }

2 Answers


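Before inserting strLine into the element, check whether strLine.endsWith(","). If it does, read the next line (and so on, for as long as lines end with a comma) and append it to the first strLine.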
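A minimal sketch of that idea, for illustration only. The class name `CommaJoiner` and the method `joinContinuations` are made up for this example, and it assumes that a trailing comma always means the record continues on the next non-blank line. It works on a list of lines rather than on `br` directly, so it can be tried in isolation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CommaJoiner {
    // Hypothetical helper: merges every line that ends with "," with the
    // following non-blank line(s), so each returned string is one record.
    static List<String> joinContinuations(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;          // record being assembled, if any
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;                      // blank lines are skipped, as in the question
            }
            if (current == null) {
                current = new StringBuilder(line);
            } else {
                current.append(' ').append(line);
            }
            if (!line.endsWith(",")) {         // no trailing comma: the record is complete
                records.add(current.toString());
                current = null;
            }
        }
        if (current != null) {                 // input ended on a dangling comma
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1. R.-J. Roe,",
                "",
                "    J. Appl.Phys. 36, 2024 (1965).");
        for (String record : joinContinuations(lines)) {
            System.out.println(record);
        }
    }
}
```

Each joined record can then go through the existing `indexOf(".")` / `substring` logic unchanged. If the lines come from `br`, `br.lines().collect(Collectors.toList())` is one way to get the list. Note that this breaks if a complete reference happens to end with a comma.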

answered 2013-03-23T13:15:05.467

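The reason your code creates two `<ref>` records when it reads a record containing an extra line break is that you are using the line break to decide where a record starts.

You need to define explicitly what marks the start of a record.

For example, maybe all records start with a number followed by a period. Maybe it is even more predictable: they all start with a sequential number followed by a period. Using this logic, we can make the creation of a new element conditional: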

    Element ref = doc.createElement("ref");
    while ((strLine = br.readLine()) != null) {
        if (strLine.equals(""))
            continue;
        if (refStart(strLine)) {
            //only a line that starts a record has a leading number to split off;
            //doing this for a continuation line without a "." would throw
            int dotIndex = strLine.indexOf(".");
            num = strLine.substring(0, dotIndex);
            mix = strLine.substring(dotIndex + 2, strLine.length());
            ref = doc.createElement("ref");
            reflist.appendChild(ref);
        }
        //now decide how to parse the input - maybe it will be different depending on
        //whether the line we just read starts a new record or continues one from
        //the previous line.
    }


    public boolean refStart(String line) {
        if (line.length() < 2)
            return false;
        int dotIndex = line.indexOf(".");
        if (dotIndex <= 0 || dotIndex > 5) //assuming largest value is 99999
            return false;
        //everything before the first period must be a digit
        String numString = line.substring(0, dotIndex);
        for (int i = 0; i < numString.length(); i++) {
            if (!Character.isDigit(numString.charAt(i)))
                return false;
        }
        return true;
    }
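
To show how the "decide how to parse" step might be filled in, here is a rough sketch that groups lines into records before any XML is built. The names `RefGrouper`, `groupRecords` and `REF_START` are invented for this example. It assumes a record starts with one to five digits, a period and whitespace (the same rule as `refStart`, written as a regular expression), and that every other non-blank line continues the previous record:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RefGrouper {
    // Assumption: a record starts with 1-5 digits, a period, then whitespace.
    private static final Pattern REF_START =
            Pattern.compile("^(\\d{1,5})\\.\\s+(.*)$");

    // Hypothetical helper: returns one {number, text} pair per record.
    static List<String[]> groupRecords(List<String> lines) {
        List<String[]> records = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;                          // skip blank lines
            }
            Matcher m = REF_START.matcher(line);
            if (m.matches()) {
                // start of a new record: group 1 is the number, group 2 the text
                records.add(new String[] { m.group(1), m.group(2) });
            } else if (!records.isEmpty()) {
                // continuation line: append it to the most recent record
                String[] last = records.get(records.size() - 1);
                last[1] = last[1] + " " + line;
            }
            // a continuation line that appears before any record start is ignored
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1. R.-J. Roe,",
                "    J. Appl.Phys. 36, 2024 (1965).");
        for (String[] record : groupRecords(lines)) {
            System.out.println(record[0] + " -> " + record[1]);
        }
    }
}
```

Each pair can then be turned into one `<ref>` element, with `record[0]` as the `id` and `<label>` and `record[1]` as the `<mixed-citation>` text. A continuation line that itself begins with digits and a period would be mistaken for a new record, so the start rule may need tightening for real data.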
answered 2013-03-23T13:28:32.533