I am trying to scrape Craigslist. When I try to fetch https://tampa.craigslist.org/search/jjj?query=bookkeeper in my spider, I get the following error:
(extra line breaks and spaces added for readability)
[scrapy.downloadermiddlewares.retry] DEBUG:
Retrying <GET https://tampa.craigslist.org/search/jjj?query=bookkeeper> (failed 1 times):
[<twisted.python.failure.Failure twisted.internet.error.ConnectionLost:
Connection to the other side was lost in a non-clean fashion: Connection lost.>]
However, when I fetch the same URL in the Scrapy shell, it is crawled successfully:
[scrapy.core.engine] DEBUG:
Crawled (200) <GET https://tampa.craigslist.org/search/jjj?query=bookkeeper>
(referer: None)
I don't know what I'm doing wrong here. I have tried forcing TLSv1.2, but with no luck. I would sincerely appreciate your help. Thanks!
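For reference, forcing TLS 1.2 in Scrapy is normally done through the `DOWNLOADER_CLIENT_TLS_METHOD` setting. The fragment below is only a sketch of what such an attempt typically looks like, assuming it lives in the project's `settings.py`; the exact configuration used in the attempt above is not shown in this post.

```python
# settings.py (project-level Scrapy settings)
# Assumption: an illustration of forcing TLS 1.2, not the exact
# configuration from the failed attempt described above.
# Scrapy's default for this setting is "TLS" (negotiate the protocol version).
DOWNLOADER_CLIENT_TLS_METHOD = "TLSv1.2"
```

The same key can also be set per spider through the `custom_settings` class attribute instead of the project-wide settings file.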