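When I run a short script that imports requests_html, I get a long chain of errors. I'd like to understand why the error occurs and learn how to get past it.
Here is my code: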
import requests_html
from requests_html import HTML
# Read the saved page, then parse it with requests-html
with open('disinformation_index.html') as html_file:
    source = html_file.read()
    html = HTML(html=source)
print(html.text)
Here is the output:
Traceback (most recent call last):
  File "/Users/jp/webscraping_with_requests_html.py", line 5, in <module>
    import requests_html
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests_html.py", line 9, in <module>
    import pyppeteer
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/__init__.py", line 30, in <module>
    from pyppeteer.launcher import connect, launch, executablePath # noqa: E402
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/launcher.py", line 24, in <module>
    from pyppeteer.browser import Browser
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/browser.py", line 13, in <module>
    from pyppeteer.connection import Connection
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/connection.py", line 12, in <module>
    import websockets
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/__init__.py", line 3, in <module>
    from .auth import *
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/auth.py", line 15, in <module>
    from .server import HTTPResponse, WebSocketServerProtocol
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/server.py", line 49, in <module>
    from .protocol import WebSocketCommonProtocol
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/protocol.py", line 18, in <module>
    from typing import (
ImportError: cannot import name 'Deque'
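I am running the code in the Thonny IDE, with the interpreter set to Python 3.6.0. I'm on macOS Catalina.
It looks as though the failure comes from a subpackage within the requests-html library. I'm too new to coding to be sure, but the code does seem to reach the requests-html module and then break while importing some of its parts.
Is that right? Is there a way around this?
Is something wrong with my interpreter (3.6.0)? I read that requests-html is only supported on Python 3.6, so I assumed I had to set that as my interpreter in Thonny.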
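A quick diagnostic along the following lines might help narrow down whether the interpreter version is the culprit. It is only a sketch: has_typing_deque is a hypothetical helper name, and it assumes the failure comes from typing.Deque being absent in this particular Python build (the Python documentation lists typing.Deque as new in version 3.6.1, which would explain a failure on 3.6.0).

```python
import sys
import typing


def has_typing_deque():
    """Return True if this interpreter's typing module exposes Deque."""
    # Hypothetical helper: typing.Deque is documented as new in Python 3.6.1,
    # so an interpreter at exactly 3.6.0 is expected to return False here.
    return hasattr(typing, "Deque")


print("Python version:", sys.version.split()[0])
print("typing.Deque available:", has_typing_deque())
```

If this reports that typing.Deque is unavailable, pointing Thonny at a later 3.6.x (or newer) interpreter is probably a cleaner route than patching files inside site-packages.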
The code snippet comes from Corey Schafer's web scraping tutorial on YouTube, here:
https://www.youtube.com/watch?v=a6fIbtFB46g
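Someone suggested that a post about editing the protocol.py file, so that it imports 'Deque' from collections instead of from typing (which is where the stock 3.6 version imports it from), might be the solution.
I just tried it, and it failed.
Here is my new error message after editing protocol.py (the failure is again "cannot import name 'Deque'"):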
Traceback (most recent call last):
  File "/Users/joelprestonsmith/webscraping_with_requests_html.py", line 7, in <module>
    from requests_html import HTML
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests_html.py", line 9, in <module>
    import pyppeteer
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/__init__.py", line 30, in <module>
    from pyppeteer.launcher import connect, launch, executablePath # noqa: E402
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/launcher.py", line 24, in <module>
    from pyppeteer.browser import Browser
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/browser.py", line 13, in <module>
    from pyppeteer.connection import Connection
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyppeteer/connection.py", line 12, in <module>
    import websockets
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/__init__.py", line 3, in <module>
    from .auth import *
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/auth.py", line 15, in <module>
    from .server import HTTPResponse, WebSocketServerProtocol
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/server.py", line 49, in <module>
    from .protocol import WebSocketCommonProtocol
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/websockets/protocol.py", line 13, in <module>
    from collections import Deque
ImportError: cannot import name 'Deque'
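The second traceback may have a different cause from the first. As a hedged illustration (not a recommended patch for websockets itself), the standard collections module provides the runtime class deque in lowercase, while the capitalized Deque is a type-hint alias that lives in typing. The sketch below checks both names; the fallback assignment of None is purely an assumption made for this illustration.

```python
import collections

try:
    # The capitalized alias lives in typing (documented as new in Python 3.6.1).
    from typing import Deque
except ImportError:
    # Assumption for this sketch: an older interpreter without typing.Deque.
    Deque = None

# collections only offers the lowercase runtime class.
print(hasattr(collections, "Deque"))  # False
print(hasattr(collections, "deque"))  # True
```

So an edited line reading "from collections import Deque" would be expected to fail on any Python version, independent of the original typing problem.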
Thank you.