
I am trying to process some data with lxml. It works fine on my development server, but in production the following code:

from lxml import etree
parser = etree.XMLParser(encoding='cp1251')

throws:

  File "parser.pxi", line 1288, in lxml.etree.XMLParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:77726)
  File "parser.pxi", line 738, in lxml.etree._BaseParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:73404)
LookupError: unknown encoding: 'cp1251'

I am using lxml 2.3, and GAE appears to support the same version. So why does this error occur?

Edit

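I specified different encodings for XMLParser, e.g. cp1252, ISO-8859-5, and ISO-8859-2. It always throws the same error on GAE but works on my local machine. These are popular encodings, and lxml on GAE ought to support them. I believe the problem lies in the lxml build on GAE.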
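The check described above can be sketched as a small probe that tries each encoding name and records which ones are rejected. This is only an illustration: `probe_encodings` and its `make_parser` argument are hypothetical names, and the only behavior assumed is the one shown in the traceback, namely that the parser constructor raises `LookupError` for a name it does not recognize.

```python
def probe_encodings(make_parser, names):
    """Try to build a parser for each encoding name.

    make_parser is any callable accepting an `encoding` keyword,
    for example lxml's etree.XMLParser (hypothetical helper, not part of lxml).
    Returns a dict mapping each name to None on success,
    or to the error message if LookupError was raised.
    """
    results = {}
    for name in names:
        try:
            make_parser(encoding=name)
        except LookupError as exc:
            # Same exception type as in the traceback above.
            results[name] = str(exc)
        else:
            results[name] = None
    return results
```

With lxml installed, calling `probe_encodings(etree.XMLParser, ['cp1251', 'cp1252', 'ISO-8859-5', 'ISO-8859-2', 'utf-8'])` on both the local machine and GAE would show which names each build rejects.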

I filed an issue: http://code.google.com/p/googleappengine/issues/detail?id=7315

Edit 2

Full traceback:

unknown encoding: 'cp1251'
Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~my_cool_app_id/1.358126884781269352/main.py", line 29, in get
    parser = etree.XMLParser(encoding='cp1251')
  File "parser.pxi", line 1288, in lxml.etree.XMLParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:77726)
  File "parser.pxi", line 738, in lxml.etree._BaseParser.__init__ (third_party/apphosting/python/lxml/src/lxml/lxml.etree.c:73404)
LookupError: unknown encoding: 'cp1251'

1 Answer


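There appears to be a bug report about this behavior on OS X, where specifying encoding="cp1252" causes the error above. The comments also indicate that other systems are affected: https://bugs.launchpad.net/lxml/+bug/707396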

Have you tried specifying other encodings, to see whether the problem is limited to cp1252?
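If the lxml build in question really cannot resolve the encoding name, one possible workaround is to decode the bytes with Python's own codecs and hand the parser UTF-8 instead. The sketch below is an assumption on my part, not something confirmed by the bug report: `recode_to_utf8` is a hypothetical helper, and it assumes the whole document fits in memory and that any XML declaration sits at the very start of the input.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration,
# e.g. <?xml version="1.0" encoding="windows-1251"?>
_DECL_ENCODING = re.compile(
    r'^(\s*<\?xml[^>]*?encoding\s*=\s*)(["\'])[A-Za-z][A-Za-z0-9._-]*\2'
)


def recode_to_utf8(raw, source_encoding):
    """Return `raw` (bytes in source_encoding) re-encoded as UTF-8 bytes.

    Hypothetical helper: decoding uses Python's codecs rather than the
    parser's, and a declared encoding is rewritten to UTF-8 so the
    declaration matches the new bytes.
    """
    text = raw.decode(source_encoding)
    text = _DECL_ENCODING.sub(r'\1"UTF-8"', text, count=1)
    return text.encode("utf-8")
```

The result can then be passed to `etree.fromstring(...)` with a default parser, so no `encoding=` argument is needed. Whether this sidesteps the error on GAE is untested here.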

Answered 2012-04-09T21:53:44.690