
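I am retrieving the content below from a URL and parsing it as XML. My problem is with the <description> tag: I want to display the contents of <description> in an HTML view. If it were plain text only, I could use the Html.fromHtml() method to format it as HTML. But the content is a mix of image tags and text, and I have not been able to create a view for it.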

<item>
<title>Delhi Protests: Senior leaders to meet today to finalise strategy</title>
<link></link>
<description>The senior-most leaders of the opposition will meet today to decide the party's strategy on the massive protests in Delhi demanding justice for the medical student...&lt;img width='1' height='1' src='http://xxx.com.feedsportal.com/c/33805/f/606695/s/26e4bf01/mf.gif' border='0'/&gt;&lt;div class='mf-viral'&gt;&lt;table border='0'&gt;&lt;tr&gt;&lt;td valign='middle'&gt;&lt;a href="http://share.feedsportal.com/viral/sendEmail.cfm?lang=en&amp;title=Students+Protests%3A+Senior+leaders+to+meet+today+to+finalise+strategy&amp;link=http%3A%2F%2Fwww.someweb.com%2Farticle%2Findia%2Fdelhi-protests-leaders-to-meet-today-to-finalise-strategy-309096" target="_blank"&gt;&lt;img src="http://res3.feedsportal.com/images/emailthis2.gif" border="0" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;td valign='middle'&gt;&lt;a href="http://res.feedsportal.com/viral/bookmark.cfm?title=Delhi+Protests%3A+Senior+leaders+to+meet+today+to+finalise+strategy&amp;link=http%3A%2F%2Fwww.somewebsite.com%2Farticle%2Findia%2Fdelhi-protests-senior-leaders-to-meet-today-to-finalise-strategy-309096" target="_blank"&gt;&lt;img src="http://res3.feedsportal.com/images/bookmark.gif" border="0" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/div&gt;&lt;br/&gt;&lt;br/&gt;&lt;a href="http://da.feedsportal.com/r/151883806109/u/49/f/606695/c/33805/s/26e4bf01/a2.htm"&gt;&lt;img src="http://da.feedsportal.com/r/151883806109/u/49/f/606695/c/33805/s/26e4bf01/a2.img" border="0"/&gt;&lt;/a&gt;&lt;img width="1" height="1" src="http://pi.feedsportal.com/r/151883806109/u/49/f/606695/c/33805/s/26e4bf01/a2t.img" border="0"/&gt;&lt;img src="http://feeds.feedburner.com/~r/NdtvNews-TopStories/~4/Dchb80wBdEY" height="1" width="1"/&gt;</description>

<category domain="">india</category>
<pubDate>Mon, 24 Dec 2012 06:28:35 GMT</pubDate>
<guid isPermaLink="false">309096</guid>
<feedburner:origLink>http://ndtv.com.feedsportal.com/c/33805/f/606695/s/26e4bf01/l/0L0Sndtv0N0Carticle0Cindia0Cdelhi0Eprotests0Esenior0Ebjp0Eleaders0Eto0Emeet0Etoday0Eto0Efinalise0Estrategy0E30A90A96/story01.htm</feedburner:origLink>

Any help is appreciated.


1 Answer


You can use any available HTML parser, such as HtmlCleaner or Jsoup. These can strip the image tags and URLs from the content you pass in and return plain text.

Take a look at the HtmlCleaner documentation.
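To illustrate the idea of reducing the description to plain text, here is a minimal sketch that uses only the Java standard library. It is a crude regex-based stand-in, not HtmlCleaner or Jsoup: it assumes the XML parser has already unescaped the `&lt;`/`&gt;` entities so the description is an ordinary HTML string, it does not decode remaining HTML entities, and it will misbehave if an attribute value contains a `>` character. The class name `DescriptionText` and the method `toPlainText` are hypothetical names chosen for this example. With Jsoup on the classpath, `Jsoup.parse(html).text()` does the same job more robustly.

```java
import java.util.regex.Pattern;

// Hypothetical helper: strips markup from an already-unescaped
// <description> string. A real HTML parser handles far more edge cases.
class DescriptionText {
    // Matches anything that looks like a tag, e.g. <img .../> or </a>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String toPlainText(String html) {
        // Replace each tag with a space so adjacent words do not fuse,
        // then collapse runs of whitespace and trim the ends.
        String withoutTags = TAG.matcher(html).replaceAll(" ");
        return WHITESPACE.matcher(withoutTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the shape of the feed's description.
        String description = "Senior leaders will meet today..."
                + "<img width='1' height='1' src='http://example.com/mf.gif' border='0'/>"
                + "<div class='mf-viral'><a href=\"http://example.com/share\">"
                + "<img src=\"http://example.com/email.gif\" border=\"0\" /></a></div>"
                + "<br/><br/>";
        System.out.println(toPlainText(description));
    }
}
```

The resulting string contains no markup, so it can be placed directly in a plain text view without going through Html.fromHtml().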

Answered 2012-12-24T12:02:27.930