I have this piece of code:
from bs4 import BeautifulSoup
for i in range(1,10):
    print(str(i))
    soup = BeautifulSoup(open("downloads/" + str(i) + ".html","rt"), 'html.parser')
    text1 = soup.find_all("div", class_="content html_format")
    text1 = text1[0].get_text()
    print(text1)
When I run it, I get an error:
Traceback (most recent call last):
  File "classifier1.py", line 6, in <module>
    soup = BeautifulSoup(open("downloads/" + str(i) + ".html","rt"), 'html.parser')
  File "C:\Users\K18\Desktop\redictgames\classifier\bs4\__init__.py", line 175, in __init__
    markup = markup.read()
  File "C:\python34\lib\encodings\cp1251.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 11152: character maps to <undefined>
I also tried the 'rb' and 'r' modes, but that did not work either...
The files contain articles with Russian words.
A few days ago it worked fine.
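
One thing I noticed: the traceback shows the file being decoded with cp1251 (the default on my machine), where byte 0x98 is undefined, while in UTF-8 that byte occurs in Cyrillic text (for example, "И" is encoded as D0 98). So the files may actually be UTF-8. Below is a minimal sketch of reading with an explicit encoding under that assumption; I have not confirmed the real encoding of my files, and `read_article` and the sample file are hypothetical names used only for illustration.

```python
import os
import tempfile


def read_article(path, encoding="utf-8"):
    """Hypothetical helper: read an HTML file as text with an explicit encoding."""
    # Without encoding=..., open() falls back to the locale default,
    # which is cp1251 in the traceback above.
    with open(path, "rt", encoding=encoding) as f:
        return f.read()


# Throwaway sample file standing in for downloads/1.html (assumed to be UTF-8).
sample = '<div class="content html_format">История</div>'
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "1.html")
with open(sample_path, "wb") as f:
    f.write(sample.encode("utf-8"))

# Reading as UTF-8 returns the original text.
text = read_article(sample_path)

# Reading the same bytes as cp1251 raises the same kind of error as above.
try:
    read_article(sample_path, encoding="cp1251")
    cp1251_failed = False
except UnicodeDecodeError:
    cp1251_failed = True
```

If this assumption holds, the string returned by `read_article` could be passed to `BeautifulSoup(text, 'html.parser')` in place of the open file object. If the files turn out to be in some other encoding, the `encoding` argument would need to change accordingly.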