I'm using a simple Python script to fetch the booking results for my CID. simple.py:

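import requests

# Query parameters; placeholder values are left exactly as in the original post.
data = {"minorRev": "current minorRev #", "cid": "xxx", "apiKey": "xxx", "customerIpAddress": "  ", "creationDateStart": "03/31/2013"}

url = 'http://someservice/services/rs/'
req = requests.get(url, params=data)
print req
print req.text  # a unicode object; this is the print that fails when piped
print req.status_code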

Now, at the command prompt, if I run python simple.py, it works perfectly and prints the req.text variable.

However, when I try

python simple.py | grep pattern

I get

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 1314: ordinal not in range(128)

2 Answers

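print needs to encode the string before sending it to stdout, but when the process is in a pipe, the value of sys.stdout.encoding is None. In that case print receives a unicode object and tries to encode it with the ascii codec; if that unicode object contains non-ASCII characters, an exception is raised.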
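To make the failure concrete, here is a minimal sketch of what the ascii codec does with the character from the question's error message. `try_encode` is a hypothetical helper introduced only for illustration; it is not part of the original scripts.

```python
# -*- coding: utf-8 -*-

# The character reported in the question's traceback (U+00E4, "a" with diaeresis).
text = u'\xe4'


def try_encode(value, codec):
    """Return the encoded bytes, or None if the codec cannot represent value.

    Hypothetical helper, used here only to show the two outcomes side by side.
    """
    try:
        return value.encode(codec)
    except UnicodeEncodeError:
        return None


# The ascii codec only covers code points 0..127, so this fails.
ascii_result = try_encode(text, 'ascii')

# UTF-8 can represent any Unicode code point, so this succeeds.
utf8_result = try_encode(text, 'utf-8')
```

This is the same encode step that print performs implicitly in Python 2 when no stdout encoding is known.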

You can solve this by encoding all unicode objects before sending them to standard output (but you need to guess which codec to use). See the following example:

File wrong.py

# coding: utf-8

print u'Álvaro'

Result:

alvaro@ideas:/tmp
$ python wrong.py 
Álvaro
alvaro@ideas:/tmp
$ python wrong.py | grep a
Traceback (most recent call last):
  File "wrong.py", line 3, in <module>
    print u'Álvaro'
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in position 0: ordinal not in range(128)

File right.py

# coding: utf-8

print u'Álvaro'.encode('utf-8')
# an encoded unicode object is a `str` (byte string) in Python 2

Result:

alvaro@ideas:/tmp
$ python right.py 
Álvaro
alvaro@ideas:/tmp
$ python right.py | grep a
Álvaro
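One way to avoid calling .encode('utf-8') at every print is to pick the codec in a single place. The sketch below is an assumption-laden illustration, not part of the answer above: `pick_encoding` and `encode_for_output` are hypothetical names, and the 'utf-8' fallback is a guess that may not match every environment.

```python
import sys


def pick_encoding(stream, fallback='utf-8'):
    """Return the stream's declared encoding, or the fallback if it has none.

    Hypothetical helper. In Python 2, sys.stdout.encoding is None when the
    output is piped, which is the case this fallback is meant to cover.
    """
    return getattr(stream, 'encoding', None) or fallback


def encode_for_output(text, stream=None, fallback='utf-8'):
    """Encode a unicode string with the codec chosen for the given stream."""
    if stream is None:
        stream = sys.stdout
    return text.encode(pick_encoding(stream, fallback))


class FakePipe(object):
    """Stand-in for a piped stdout that reports no encoding (assumption)."""
    encoding = None


# With no declared encoding, the fallback codec is used.
encoded = encode_for_output(u'\xc1lvaro', FakePipe())
```

In a Python 2 script the resulting byte string could then be passed to print directly, as right.py does.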
answered 2013-04-01T09:27:07.360

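If sys.stdout.isatty() is false (the output is redirected to a file/pipe), configure the PYTHONIOENCODING envvar outside your script. Always print Unicode; don't hardcode the environment's character encoding inside your script: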

$ PYTHONIOENCODING=utf-8 python simple.py | grep pattern
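The effect of the variable can be checked from Python itself. The following is a rough sketch, assuming a working interpreter at sys.executable; `child_stdout_encoding` is a hypothetical helper that starts a child interpreter with its stdout captured through a pipe and reports what that child sees as sys.stdout.encoding.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and piped stdout.

    Returns the child's sys.stdout.encoding as text. Hypothetical helper,
    written only to illustrate the environment variable's effect.
    """
    env = dict(os.environ)
    env['PYTHONIOENCODING'] = io_encoding
    out = subprocess.check_output(
        [sys.executable, '-c',
         'import sys; sys.stdout.write(str(sys.stdout.encoding))'],
        env=env)
    return out.decode('ascii').strip()


# stdout of the child is a pipe here, as in "python simple.py | grep pattern".
reported = child_stdout_encoding('utf-8')
```

Because the encoding is supplied by the environment, the script itself can keep printing Unicode without naming a codec.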
answered 2016-01-30T11:39:34.643