java - Java - Android SDK 8 - XML parsing with DocumentBuilderFactory terminates strings after an entity

Question

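I'm writing an application in Java for Android (SDK v8) that parses XML and puts the entries into a ListView. That part works fine. I parse the XML with DocumentBuilder, and it terminates the strings it outputs after an entity, without including the entity itself. The entities I use are the standard ones, &(quot, amp, apos, lt, gt);. I also tried using numeric entities in my source XML (e.g. & #38; without the space, written this way so you can see what I am outputting), which crashes my app, with logcat reporting "unterminated entity ref".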

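To check that my XML is not invalid, I tried viewing it in Google Chrome, which displays it perfectly. The entry blah & blah.txt is truncated to blah. The XML I am parsing is as follows: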

Edit: shorter XML example

<?xml version="1.1"?>  
<root>  
<object>  
<id>ROOT</id>  
<type>directory</type>  
<name>../</name>  
</object>  
<object>  
<id>09F010C143B84573A36C50F3EF7E0708</id>  
<type>file</type>  
<name>blah &amp; blah.txt</name>  
</object>   
<object>  
<id>85CF028B838D4E0096C081B987C97045</id>  
<type>file</type>  
<name>Epilist.m3u</name>  
</object>  
</root>

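Edit: XML parsing class. EDIT2: Below is the complete class, which (with help from others) should now be free of errors. Anyone is welcome to use this class; I am releasing it as public domain code. You do not need to credit me as the original author to use it. It is designed for Android, but as far as I know it can easily be used on any Java platform by replacing the references to "Log.e".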

package tk.dtechsoftware.mpclient;

import java.io.IOException;
import java.io.StringReader;
import java.io.UnsupportedEncodingException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import android.util.Log;

public class XMLParser {
    public String getXmlFromUrl(String url) {
        String xml = null;

        try {
            // Fetch the document with an HTTP GET request
            DefaultHttpClient httpClient = new DefaultHttpClient();
            HttpGet httpGet = new HttpGet(url);

            HttpResponse httpResponse = httpClient.execute(httpGet);
            HttpEntity httpEntity = httpResponse.getEntity();
            xml = EntityUtils.toString(httpEntity);

        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        // return XML
        return xml;
    }

    public Document getDomElement(String xml) {
        Document doc = null;
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setExpandEntityReferences(false);
        try {

            DocumentBuilder db = dbf.newDocumentBuilder();

            InputSource is = new InputSource();
            is.setCharacterStream(new StringReader(xml));
            doc = db.parse(is);

        } catch (ParserConfigurationException e) {
            // Pass the exception itself: e.getMessage() may be null
            Log.e("XMLParser", "Parser configuration error", e);
            return null;
        } catch (SAXException e) {
            Log.e("XMLParser", "Malformed XML", e);
            return null;
        } catch (IOException e) {
            Log.e("XMLParser", "I/O error while parsing", e);
            return null;
        }
        // return DOM
        return doc;
    }

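    public String getValue(Element item, String str) {
        NodeList n = item.getElementsByTagName(str);
        // Return an empty string when no matching element exists
        if (n.getLength() == 0) {
            return "";
        }
        // getTextContent() joins all child text nodes, including text split around an entity
        return n.item(0).getTextContent();
    }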

}
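
For anyone who wants to try the parsing approach outside Android, here is a minimal sketch that runs on a plain JVM. It assumes the XML is already available as a string; the class name ObjectNameLister and the method listNames are hypothetical and are not part of the class above. It reads every name element through getTextContent(), as getValue does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ObjectNameLister {

    // Hypothetical helper: collects the text of the first <name> in every <object>.
    public static List<String> listNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));

            NodeList objects = doc.getElementsByTagName("object");
            for (int i = 0; i < objects.getLength(); i++) {
                Element object = (Element) objects.item(i);
                NodeList nameNodes = object.getElementsByTagName("name");
                if (nameNodes.getLength() > 0) {
                    // getTextContent() returns the full text, entity included
                    names.add(nameNodes.item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Shortened version of the sample document from the question
        String xml = "<root>"
                + "<object><id>ROOT</id><type>directory</type><name>../</name></object>"
                + "<object><id>09F010C143B84573A36C50F3EF7E0708</id><type>file</type>"
                + "<name>blah &amp; blah.txt</name></object>"
                + "</root>";
        for (String name : listNames(xml)) {
            System.out.println(name);
        }
    }
}
```

Run on a desktop JVM, this prints ../ followed by blah & blah.txt, so the entity survives when the element is read as a whole.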

score 1 · Accepted Answer

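I don't think it is guaranteed that an element node has only one child node containing its text content. The content can also be split across multiple child nodes.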

Your getElementValue method could probably be replaced by a call to elem.getTextContent().
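
To illustrate the difference, here is a minimal sketch that builds the split by hand instead of relying on a particular parser. It assumes a parser that produces three adjacent text nodes for the name element; whether a given parser actually splits text this way is implementation-dependent. The class name SplitTextDemo and its methods are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SplitTextDemo {

    // Builds <name> with three adjacent text nodes, mimicking a parser
    // that starts a new text node at an entity (an assumption for illustration).
    public static Element buildSplitName() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("blah "));
            name.appendChild(doc.createTextNode("&"));
            name.appendChild(doc.createTextNode(" blah.txt"));
            doc.appendChild(name);
            return name;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Reads only the first child node, so it sees only the first fragment.
    public static String firstChildText(Element elem) {
        return elem.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        Element name = buildSplitName();
        System.out.println("[" + firstChildText(name) + "]");       // prints [blah ]
        System.out.println("[" + name.getTextContent() + "]");      // prints [blah & blah.txt]
    }
}
```

Reading only the first child reproduces the truncation described in the question, while getTextContent() concatenates all the fragments.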

