python - Python 2.7, decoding problem ('utf-8')

Question

I have:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from urllib2 import urlopen

page2 = urlopen('http://pogoda.yandex.ru/moscow/').read().decode('utf-8')

page = urlopen('http://yasko.by/').read().decode('utf-8')

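On the "page ..." line I get the error "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 32: invalid continuation byte", but the "page2 ..." line raises no error. Why?

Cyrillic characters start at position 32 of the yasko.by page. How can I decode them correctly?

Thanks!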

score 2 · Accepted Answer

The content of http://yasko.by/ is encoded in windows-1251, whereas the content of http://pogoda.yandex.ru/moscow/ is encoded in utf-8.
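To see why the utf-8 codec complains about byte 0xc3 in particular, here is a small offline sketch. The two-letter sample is an assumption chosen for illustration, not the actual bytes of the yasko.by page; it only reproduces the same kind of failure.

```python
# Illustrative sample (an assumption, not taken from the real page):
# the Cyrillic letters U+0413 U+043B stored the way windows-1251 stores them.
sample = u'\u0413\u043b'.encode('windows-1251')

# In windows-1251 each Cyrillic letter is one byte; U+0413 becomes 0xC3.
first_byte = bytearray(sample)[0]

# Decoding with the matching codec restores the original text.
as_cp1251 = sample.decode('windows-1251')

# UTF-8 reads 0xC3 as the lead byte of a two-byte sequence and expects the
# next byte to lie in 0x80-0xBF. The next byte here is 0xEB, so decoding fails.
try:
    sample.decode('utf-8')
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

print(hex(first_byte))
print(utf8_error)
```

The same bytes are valid windows-1251 and invalid utf-8, which is why one page decodes and the other does not.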

The page = ... line should become:

page = urlopen('http://yasko.by/').read().decode('windows-1251')
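If the encoding is not known in advance, one possible approach is to try a short list of candidate encodings in order. The sketch below is only an illustration: `decode_page`, its parameters, and the fallback list are hypothetical names introduced here, not part of urllib2. The strict utf-8 codec is tried first because windows-1251 accepts almost any byte sequence and would otherwise mask utf-8 pages.

```python
def decode_page(raw_bytes, declared_charset=None,
                fallbacks=('utf-8', 'windows-1251')):
    """Hypothetical helper: decode raw_bytes with the first codec that works.

    declared_charset, if given, is tried before the fallbacks.
    """
    candidates = []
    if declared_charset:
        candidates.append(declared_charset)
    candidates.extend(fallbacks)
    for name in candidates:
        try:
            return raw_bytes.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Wrong codec for these bytes, or an unknown codec name: try the next.
            continue
    raise ValueError('none of the candidate encodings could decode the data')


# Offline demonstration with made-up bytes (assumed samples, not real page content).
cp1251_bytes = u'\u0413\u043b\u0430\u0432\u043d\u0430\u044f'.encode('windows-1251')
utf8_bytes = u'\u043f\u043e\u0433\u043e\u0434\u0430'.encode('utf-8')

text_from_cp1251 = decode_page(cp1251_bytes)
text_from_utf8 = decode_page(utf8_bytes)
```

With urllib2 in Python 2.7, a charset sent in the Content-Type header can be read with `response.info().getparam('charset')` and passed as `declared_charset`. Whether a given server sends one is not guaranteed, so the fallback list still matters.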
