I want to parse an XML feed from Baidu (GB2312-encoded): http://news.baidu.com/n?cmd=1&class=civilnews&tn=rss
I always get this error:
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 3, column 8
If I switch the URL to the Google feed http://news.google.com/news?cf=all&ned=us&hl=en&topic=b&output=rss, it works. Any suggestions?
def get_feeds():
    import sys
    import xml.etree.ElementTree as etree
    from urllib import urlopen
    URL = "http://news.baidu.com/n?cmd=1&class=civilnews&tn=rss"
    #URL = "http://news.google.com/news?cf=all&ned=us&hl=en&topic=b&output=rss"
    tree = etree.parse(urlopen(URL))

if __name__ == '__main__':
    get_feeds()
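
One possible workaround, assuming the failure comes from expat not handling the GB2312 encoding that the feed declares (I have not confirmed this against the live feed): decode the raw bytes yourself, rewrite the encoding in the XML declaration to UTF-8, and parse the re-encoded bytes. The sketch below is written for Python 3 and runs on an in-memory sample rather than the real URL. The names parse_gb2312_xml and _DECL_ENCODING are made up for illustration, and errors="replace" is my own choice so that stray bytes become U+FFFD instead of raising.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper pattern: matches the encoding pseudo-attribute
# inside an XML declaration at the very start of the document.
_DECL_ENCODING = re.compile(
    r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2', re.IGNORECASE
)


def parse_gb2312_xml(raw, source_encoding="gb2312"):
    """Decode raw feed bytes, declare UTF-8 instead, and parse the result.

    Assumption: the bytes really are in source_encoding. Undecodable
    bytes are replaced with U+FFFD rather than raising an error.
    """
    text = raw.decode(source_encoding, errors="replace")
    # Rewrite only the declared encoding; the rest of the document is untouched.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>utf-8\g<2>", text, count=1)
    return ET.fromstring(text.encode("utf-8"))


if __name__ == "__main__":
    # Stand-in for the downloaded feed body; a real script would read
    # the bytes from the HTTP response instead.
    sample = (
        '<?xml version="1.0" encoding="gb2312"?>\n'
        "<rss><channel><title>\u65b0\u95fb</title></channel></rss>"
    ).encode("gb2312")
    root = parse_gb2312_xml(sample)
    print(root.find("channel/title").text)
```

If the feed contains characters outside GB2312 proper, passing source_encoding="gb18030" (a superset) may decode more of it.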