I have the following piece of code:

from bs4 import BeautifulSoup

for i in range(1,10):
    print(str(i))
    soup = BeautifulSoup(open("downloads/" + str(i) + ".html","rt"), 'html.parser')
    text1 = soup.find_all("div", class_="content html_format")
    text1 = text1[0].get_text()
    print(text1)

When I run it, I get this error:

Traceback (most recent call last):
  File "classifier1.py", line 6, in <module>
    soup = BeautifulSoup(open("downloads/" + str(i) + ".html","rt"), 'html.parser')
  File "C:\Users\K18\Desktop\redictgames\classifier\bs4\__init__.py", line 175, in __init__
    markup = markup.read()
  File "C:\python34\lib\encodings\cp1251.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 11152: character maps to <undefined>

I also tried the 'rb' and 'r' modes, but that did not work either.

The files contain articles with Russian text.

A few days ago it worked fine.
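For reference, the traceback shows that open() is decoding the file with cp1251, the locale default on this machine, and byte 0x98 has no mapping in that code page. A minimal sketch of one thing to try, assuming the downloaded pages are actually UTF-8 encoded (that is a guess, the post does not say): pass the encoding explicitly. The helper name read_html and the sample markup are made up for illustration.

```python
import os
import tempfile


def read_html(path, encoding="utf-8"):
    """Read a file as text using an explicit encoding, not the locale default."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Hypothetical sample page; the real files live in downloads/1.html etc.
sample = '<div class="content html_format">Игра</div>'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "1.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Decoding as UTF-8 round-trips the Russian text.
    html = read_html(path)

    # Decoding the same bytes as cp1251 hits byte 0x98
    # (the second byte of the UTF-8 encoding of "И") and fails.
    try:
        read_html(path, encoding="cp1251")
        cp1251_failed = False
    except UnicodeDecodeError:
        cp1251_failed = True
```

If this matches the real files, the string returned by read_html can be handed to BeautifulSoup(html, 'html.parser') in place of the open file object. If the files turn out to use some other encoding, the same approach applies with that encoding name.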
