5

我是 android 新手,在我的应用程序中,我必须解析数据并且需要在屏幕中显示。但是在一个特定的标签数据中,我无法解析原因,因为该标签中也有一些特殊字符。下面是我显示我的代码。

我的解析器功能:

  protected ArrayList<String> doInBackground(Context... params) 
    {
//      context = params[0];
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();     
        test = new ArrayList<String>();
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new java.net.URL("input URL_confidential").openConnection().getInputStream());
            //Document document = builder.parse(new URL("http://www.gamestar.de/rss/gamestar.rss").openConnection().getInputStream());
            Element root = document.getDocumentElement();
            NodeList docItems = root.getElementsByTagName("item");
            Node nodeItem;
            for(int i = 0;i<docItems.getLength();i++)
            {
                nodeItem = docItems.item(i);
                if(nodeItem.getNodeType() == Node.ELEMENT_NODE)
                {
                    NodeList element = nodeItem.getChildNodes();                    
                    Element entry = (Element) docItems.item(i);
                    name=(element.item(0).getFirstChild().getNodeValue());




//                 System.out.println("description = "+element.item(2).getFirstChild().getNodeValue().replaceAll("&lt;div&gt;&lt;p&gt;"," "));
                    System.out.println("Description"+Jsoup.clean(org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(element.item(2).getFirstChild().getNodeValue()), new Whitelist()));             


                    items.add(name);


                }
            }
        } 
        catch (ParserConfigurationException e) 
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        catch (MalformedURLException e)
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        catch (SAXException e)
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        catch (IOException e)
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        return items;
    }

输入:

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>my application</title>
<link>http:// some link</link>
<atom:link href="http:// XXXXXXXX" rel="self"></atom:link>
<language>en-us</language>
<lastBuildDate>Thu, 20 Dec 2012</lastBuildDate>
<item>
<title>lllegal settlements</title>
<link>http://XXXXXXXXXXXXXXXX</link>
<description> &lt;div&gt;&lt;p&gt;
India was joined by all members of the 15-nation UN Security Council except the US to condemn Israel’s announcement of new construction activity in Palestinian territories and demand immediate dismantling of the “illegal†settlements.
&lt;/p&gt;
&lt;p&gt;
UN Secretary General Ban Ki-moon also expressed his deep concern by the heightened settlement activity in West Bank, saying the move by Israel “gravely threatens efforts to establish a viable Palestinian state.â€
&lt;/p&gt;
&lt;p&gt;
</description>
</item>
</channel>

输出:

 lllegal settlements  ----> title tag text

     India was joined by all members of the 15-nation UN Security Council except the US to condemn Israel announcement of new construction activity in Palestinian territories and demand immediate dismantling of the illegal settlements. -----> description tag text

     UN Secretary General Ban Ki-moon also expressed his deep concern by the heightened settlement activity in West Bank, saying the move by Israel gravely threatens efforts to establish a viable Palestinian state.    ----> description tag text.
4

3 回答 3

4
于 2012-12-19T10:35:30.657 回答
0

用 Html.fromHTML() 运行节点值两到三遍就可以了。

说明:内置的 Html.fromHTML() 方法会将狂野和损坏的 HTML 转换为可用的内容。伪代码在这里:

sHTML = node.getNodeValue()
sHTML = Html.fromHTML(sHTML)
sHTML = Html.fromHTML(sHTML)
sHTML = Html.fromHTML(sHTML)

到第三次或第四次不可读的内容将再次变得可读。您可以在 textview 中显示它或使用 webview 加载数据。

于 2014-07-07T14:45:49.673 回答
0

简单地替换有问题的字符有什么问题吗?

string = string.replaceAll("&lt;", "");
string = string.replaceAll("div&gt;", "");
string = string.replaceAll("p&gt;", "");
于 2012-12-19T10:32:09.193 回答