python - Python string "conversion" after inserting into a dictionary

Question

I have some non-English text, and when I add it to a dictionary it turns into something like "\xe0\xa6\xb9\xe0\xa6\xb0\xe0\xa6\x".

Example:
obj = {}
title = 'non english text'
print "title ...",title
obj['title'] = title
print obj

It returns:

    title ... non english text
    {'title': '\xe0\xa6\xb9\xe0\xa6\'}

Any ideas on how I can fix this?

Thanks in advance.

score 3 · Accepted Answer

You are looking at UTF-8 encoded data:

>>> '\xe0\xa6\xb9\xe0\xa6\xb0'.decode('utf8')
u'\u09b9\u09b0'
>>> print '\xe0\xa6\xb9\xe0\xa6\xb0'.decode('utf8')
হর

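To decode it to Unicode text, use .decode('utf8'). If you print the byte string directly to a terminal that is configured to handle UTF-8, it is displayed as the characters it encodes. Printing the dict is different: a dict shows the repr() of its contents, that is, the Python literal representation of the data, so the non-ASCII bytes appear as \x escapes. The data itself is unchanged.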
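As a minimal sketch of the point above, the snippet below decodes the two complete UTF-8 sequences from this answer and checks that storing the result in a dict does not alter it. The names raw, text and obj are illustrative and not from the question, and the bytes literal only stands in for the asker's real title value. It is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function

# The two complete UTF-8 sequences shown in the answer above.
raw = b'\xe0\xa6\xb9\xe0\xa6\xb0'

# Decoding turns the 6 UTF-8 bytes into 2 Unicode code points.
text = raw.decode('utf8')

# Storing the value in a dict does not change it; only the way the
# dict is displayed (via repr) differs from printing the value itself.
obj = {'title': text}

print(len(raw), len(text))              # 6 2
print(obj['title'] == u'\u09b9\u09b0')  # True
```

To see the readable characters rather than the escapes, print the value (print(obj['title'])) instead of printing the whole dict.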

score 0 · Answer

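This is a Unicode handling problem. In Python 3, all text is Unicode. Give it a try: your example with a non-ASCII character set should just work, and you will save yourself some trouble.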
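A rough illustration of the Python 3 behaviour, assuming the title is already a str: the characters are written as \u escapes here only to keep the source ASCII, and the variable names are illustrative. In Python 3, repr() leaves printable non-ASCII characters as they are, so the dict shows the text itself, while ascii() gives the escaped form.

```python
# Python 3 only: str is Unicode text.
title = '\u09b9\u09b0'  # the same two characters as in the accepted answer
obj = {'title': title}

# repr() of a str keeps printable non-ASCII characters, so the dict's
# displayed form contains the text itself rather than \x escapes.
shown = repr(obj)
print(title in shown)  # True

# ascii() produces the escaped form if that is ever needed.
print(ascii(obj))      # {'title': '\u09b9\u09b0'}
```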

If you are stuck with Python 2.x, heed what Martijn said; he nailed it.
