python - How to proxy a web service through a Plone browser view, yielding UTF-8 encoded XML

Question

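I have a Plone 3.3 site that retrieves information from a RESTful web service; the service returns UTF-8 encoded XML data. The requests are sent through a special browser view (let's call it @@proxy for this question). Everything works fine as long as no non-ASCII data is returned.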

The get method of the browser view looks like this:

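# Python 2 (Plone 3.3); SERVICEBASE, auth_data, XMLCHARS and logger
# are not shown here and are assumed to be defined elsewhere.
from urllib import basejoin, urlencode
from urllib2 import urlopen

def get(self, thedict=None):
    """
    reduced version for stackoverflow
    """
    context = self.context
    request = context.request
    response = request.response
    if thedict is None:
        thedict = request.form
    # build the service URL: base + path, followed by the auth query string
    theurl = '?'.join((basejoin(SERVICEBASE, thedict['path']),
                       urlencode(auth_data),
                       ))
    fo = urlopen(theurl)
    code = fo.code
    # copy every upstream response header to the Zope response
    headers = fo.info().headers
    for line in headers:
        name, text = line.split(':', 1)
        response.setHeader(name, text.strip())    # trailing \r\n
    # decode the body, relying on the service really sending UTF-8
    text = unicode(fo.read(), 'utf-8')
    # --- debugging only ... -- 8< ----- 8< ----- 8< ----- 8< ----- 8<
    CHARSHERE = set(text)
    funnychars = CHARSHERE.difference(XMLCHARS)
    if funnychars:
        funnychars = u''.join(funnychars)
        logger.info('funny chars: %r' % funnychars)
    else:
        logger.info('funny chars: none')
    # --- ... debugging only -- >8 ----- >8 ----- >8 ----- >8 ----- >8
    # re-encode to UTF-8 bytes before handing the body to the response
    response.setBody(text.encode('utf-8'))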
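
One thing that may be worth ruling out in the header-copying loop: it forwards every upstream header, including ones that describe the upstream connection or body framing (for example Transfer-Encoding or Content-Length) rather than the content itself. Nothing in this question establishes that this is the cause. The following is only a minimal sketch of filtering such headers, written for a current Python rather than the Python 2 of Plone 3.3. The names filter_proxy_headers and SKIP_HEADERS are hypothetical, and the header lines are assumed to have the same 'Name: value\r\n' shape as above.

```python
# Hypothetical helper; not part of Plone, Zope or urllib2.
# The set is based on the hop-by-hop headers listed in RFC 2616,
# section 13.5.1, plus Content-Length (an assumption: the proxy's
# own response is expected to compute its length itself).
SKIP_HEADERS = frozenset([
    'connection', 'keep-alive', 'proxy-authenticate',
    'proxy-authorization', 'te', 'trailer', 'trailers',
    'transfer-encoding', 'upgrade', 'content-length',
])


def filter_proxy_headers(header_lines):
    """Return (name, value) pairs, dropping connection-level headers."""
    kept = []
    for line in header_lines:
        if ':' not in line:
            continue  # skip malformed or continuation lines
        name, value = line.split(':', 1)
        name = name.strip()
        if name.lower() in SKIP_HEADERS:
            continue
        kept.append((name, value.strip()))
    return kept


# Made-up sample headers, for illustration only.
upstream = [
    'Content-Type: text/xml;charset=UTF-8\r\n',
    'Transfer-Encoding: chunked\r\n',
    'Content-Length: 1234\r\n',
    'Connection: close\r\n',
]
for name, value in filter_proxy_headers(upstream):
    print('%s: %s' % (name, value))
```

In the view, the resulting pairs would be passed to response.setHeader in place of the unfiltered lines; comparing the headers the browser actually receives with and without the filter would show whether it makes any difference.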

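At first, I didn't do anything about the encoding (unicode, encode). However, it seems to be necessary: when non-ASCII data such as umlauts is included (e.g. INFO proxy@@w3l funny chars: u'\xdf'), browsers don't like the result. When I make the same request directly (to theurl rather than through my @@proxy browser view), it works. The problem doesn't depend on the web browser; both Firefox and IE choke.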

There is a Content-Type header with the value "text/xml;charset=UTF-8"; inserting a space character after the semicolon didn't change anything.
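
That the space made no difference is what one would expect, since HTTP allows optional whitespace around header parameters. As a small illustration, here is a sketch of extracting the charset from such a value; charset_from_content_type is a hypothetical helper written for a current Python, not an existing Zope or urllib2 function.

```python
def charset_from_content_type(value, default=None):
    """Extract the charset parameter from a Content-Type header value."""
    parts = value.split(';')
    # parts[0] is the media type; the rest are key=value parameters
    for param in parts[1:]:
        key, sep, val = param.partition('=')
        if sep and key.strip().lower() == 'charset':
            # drop surrounding whitespace and optional quotes
            return val.strip().strip('"\'').lower()
    return default


print(charset_from_content_type('text/xml;charset=UTF-8'))
print(charset_from_content_type('text/xml; charset=UTF-8'))
```

Both calls give the same result, so the two spellings of the header should be equivalent to any parser that follows this rule.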

Edit: this is what Seamonkey says:

XML-Verarbeitungsfehler: Kein Element gefunden
Adresse: http://.../@@proxy/get
Zeile Nr. 1, Spalte 1:

^

(In English: XML processing error: no element found; line 1, column 1)

Since I'm not the author of the service, I'm not entirely sure whether the data is really UTF-8 (although both the HTTP header and the <?xml ...> line say so).
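
One way to check that doubt is to compare what the XML declaration claims with whether the raw bytes decode as strict UTF-8. The following is a minimal sketch for a current Python; declared_encoding and is_valid_utf8 are hypothetical helper names, and the sample byte strings are made up for illustration, not taken from the service.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the
# very start of the document (ASCII-compatible encodings only).
_XML_DECL = re.compile(
    br'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def declared_encoding(raw):
    """Return the encoding named in the XML declaration, or None."""
    match = _XML_DECL.match(raw)
    if match is None:
        return None
    return match.group(1).decode('ascii').lower()


def is_valid_utf8(raw):
    """True if the byte string decodes as strict UTF-8."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


# Declaration says UTF-8 and the body really is UTF-8.
sample = b'<?xml version="1.0" encoding="UTF-8"?><a>\xc3\x9f</a>'
print(declared_encoding(sample), is_valid_utf8(sample))

# Declaration says UTF-8 but the body holds a Latin-1 byte.
broken = b'<?xml version="1.0" encoding="UTF-8"?><a>\xdf</a>'
print(declared_encoding(broken), is_valid_utf8(broken))
```

If the service's raw output passes both checks, a mislabeled encoding on the service side becomes an unlikely explanation.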

What is my mistake? What can I do to track down the problem? Am I missing something important about the Zope response object and/or urllib2.urlopen?

I also tried something like this (condensed):

fo = urlopen(theurl)
raw = fo.read().strip()
try:
    text = unicode(raw, 'utf-8')
    logger.info('Text passes off as utf-8')
except Exception, e:
    logger.info('no valid utf-8:')
    logger.exception(e)
    text = unicode(raw, 'latin-1')

... or the other way round; I never got any decoding error.
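
A note on this experiment: Latin-1 assigns a character to every byte value, so decoding as Latin-1 cannot raise, whichever order is tried. Only the strict UTF-8 attempt tells you anything. The sketch below, for a current Python, shows the difference; decode_attempts is a hypothetical helper name.

```python
def decode_attempts(raw):
    """Try strict UTF-8 and Latin-1; map each encoding to text or None."""
    results = {}
    for encoding in ('utf-8', 'latin-1'):
        try:
            results[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


# U+00DF encoded as UTF-8: both decoders succeed, but Latin-1
# silently produces two wrong characters instead of one.
print(decode_attempts(b'\xc3\x9f'))

# U+00DF encoded as Latin-1: only the UTF-8 decoder rejects it.
print(decode_attempts(b'\xdf'))
```

Since the UTF-8 attempt never failed and the log shows u'\xdf', a single correctly decoded character, the upstream bytes do appear to be valid UTF-8.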

I wrote the "raw" data to a file (open(filename, 'wb')) and checked that file with vim (set enc? fenc?), which reported utf-8.

I'm at my wits' end.
