我有一个用 GB2312 编码的 RSS 提要
当我尝试使用以下代码解析它时:
for item in XML.ElementFromURL(feed).xpath('//item'):
title = item.find('title').text
它无法解析 Feed。
任何想法如何解析 GB2312 编码的 RSS 提要
使用如下编码后,来自 Plex 媒体服务器的错误日志如下
for item in XML.ElementFromURL(feed, encoding='gb2312').xpath('//item'):
title = item.find('title').text
:
***Error Log:***
> File "C:\Documents and Settings\subhendu.swain\Local Settings\Application Data\Plex Media Server\Plug-ins\Zaobao.bundle\Contents\Code\__init__.py", line 24, in GetDetails
for item in XML.ElementFromURL(feed, encoding='gb2312').xpath('//item'):
File "C:\Documents and Settings\subhendu.swain\Local Settings\Application Data\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\parsekit.py", line 81, in ElementFromURL
return self.ElementFromString(self._core.networking.http_request(url, values, headers, cacheTime, autoUpdate, encoding, errors, immediate=True, sleep=sleep, opener=self._opener, txn_id=self._txn_id).content, isHTML=isHTML)
File "C:\Documents and Settings\subhendu.swain\Local Settings\Application Data\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\parsekit.py", line 76, in ElementFromString
return self._core.data.xml.from_string(string, isHTML)
File "C:\Documents and Settings\subhendu.swain\Local Settings\Application Data\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\components\data.py", line 134, in from_string
return etree.fromstring(markup)
File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48270)
File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:71812)
File "parser.pxi", line 1424, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:70673)
File "parser.pxi", line 938, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:67442)
File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63824)
File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64745)
File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64088)
XMLSyntaxError: switching encoding: encoder error, line 1, column 36
2011-09-28 09:34:33,453 (9d0) : DEBUG (core) - Response: 404