I'm trying to log in with scrapy-splash in exactly the same way I would with plain Scrapy. I looked at the docs, which say "SplashFormRequest.from_response is also supported, and works as described in scrapy documentation." However, simply changing that one line of code and updating the settings as described in the scrapy-splash docs produces no result at all. What am I doing wrong? Code:
import scrapy
from scrapy_splash import SplashFormRequest

class MySpider(scrapy.Spider):
    name = 'lost'
    start_urls = ["myurl"]

    def parse(self, response):
        # Fill the login form found on the page and submit it through Splash
        return SplashFormRequest.from_response(
            response,
            formdata={'username': 'pass', 'password': 'pass'},
            callback=self.after_login,
        )

    def after_login(self, response):
        print(response.body)
        if "keyword" in response.body:
            self.logger.error("Success")
        else:
            self.logger.error("Failed")
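For reference, the scrapy-splash docs also show SplashFormRequest being constructed directly, without from_response. A minimal sketch of what I understand that variant to look like (the URL and form field values below are placeholders, not my real ones):

from scrapy_splash import SplashFormRequest

def parse(self, response):
    # Build the POST request by hand instead of parsing the form from the page;
    # 'http://example.com/login' and the credentials are placeholders.
    return SplashFormRequest(
        'http://example.com/login',
        formdata={'username': 'user', 'password': 'pass'},
        callback=self.after_login,
    )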
Added to settings:
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}
SPLASH_URL = 'http://localhost:8050'
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
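In case it's relevant: the scrapy-splash README lists one more entry alongside the settings above, a spider middleware. My understanding is it would look like this (copied from the docs; it is not in my settings file):

SPIDER_MIDDLEWARES = {
    # As documented in the scrapy-splash README, with the value 100 shown there
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}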
Error log:
python@debian:~/Python/code/lostfilm$ scrapy crawl lost
2017-01-26 20:24:22 [scrapy.utils.log] INFO: Scrapy 1.3.0 started (bot: lostfilm)
2017-01-26 20:24:22 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'lostfilm.spiders', 'ROBOTSTXT_OBEY': True, 'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter', 'SPIDER_MODULES': ['lostfilm.spiders'], 'BOT_NAME': 'lostfilm', 'HTTPCACHE_STORAGE': 'scrapy_splash.SplashAwareFSCacheStorage'}
2017-01-26 20:24:22 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
Unhandled error in Deferred:
2017-01-26 20:24:22 [twisted] CRITICAL: Unhandled error in Deferred:
2017-01-26 20:24:22 [twisted] CRITICAL:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
result = g.send(result)
File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 90, in crawl
six.reraise(*exc_info)
File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 72, in crawl
self.engine = self._create_engine()
File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 97, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 58, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 49, in load_object
raise NameError("Module '%s' doesn't define any object named '%s'" % (module, name))
NameError: Module 'scrapy.downloadermiddlewares.httpcompression' doesn't define any object named 'HttpCompresionMiddlerware'