我正在使用 urllib 加载网页。有 eis 俄罗斯符号,但页面编码是 'utf-8'
1
pageData = unicode(requestHandler.read()).decode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 262: ordinal not in range(128)
2
pageData = requestHandler.read()
soupHandler = BeautifulSoup(pageData)
print soupHandler.findAll(...)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 340-345: ordinal not in range(128)