I get an error when running the first Scrapy tutorial.
Scrapy: 0.22.2
lxml: 3.3.5.0
libxml2: 2.7.8
Twisted: 12.0.0
Python: 2.7.2 (default, Oct 11 2012, 20:14:37) [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)]
Platform: Darwin-12.5.0-x86_64-i386-64bit
Here is my items.py file:
from scrapy.item import Item, Field
class DmozItem(Item):
    title = Field()
    link = Field()
    desc = Field()
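For context, fields declared this way are read and written dict-style; a minimal usage sketch (not part of my project files, and assuming the project module is named tutorial as the logs below suggest):

from tutorial.items import DmozItem

item = DmozItem()                     # Scrapy items behave like dicts
item['title'] = 'Sample title'        # only declared fields can be set
item['link'] = 'http://example.com/'
item['desc'] = 'Sample description'
print item['title']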
My dmoz_spider.py file:

from scrapy.spider import BaseSpider
class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)
Here is the error message when I run "scrapy crawl dmoz":
$ scrapy crawl dmoz
/usr/local/share/tutorial/tutorial/spiders/dmoz_spider.py:3: ScrapyDeprecationWarning: tutorial.spiders.dmoz_spider.DmozSpider inherits from deprecated class scrapy.spider.BaseSpider, please inherit from scrapy.spider.Spider. (warning only on first subclass, there may be others)
  class DmozSpider(BaseSpider):
2014-06-19 14:53:00-0500 [scrapy] INFO: Scrapy 0.22.2 started (bot: tutorial)
2014-06-19 14:53:00-0500 [scrapy] INFO: Optional features available: ssl, http11
2014-06-19 14:53:00-0500 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'}
2014-06-19 14:53:00-0500 [scrapy] INFO: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 5, in <module>
    pkg_resources.run_script('Scrapy==0.22.2', 'scrapy')
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 489, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 1207, in run_script
    execfile(script_filename, namespace, namespace)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/EGG-INFO/scripts/scrapy", line 4, in <module>
    execute()
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/commands/crawl.py", line 50, in run
    self.crawler_process.start()
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/crawler.py", line 92, in start
    if self.start_crawling():
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/crawler.py", line 124, in start_crawling
    return self._start_crawler() is not None
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/crawler.py", line 139, in _start_crawler
    crawler.configure()
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/crawler.py", line 47, in configure
    self.engine = ExecutionEngine(self, self._spider_closed)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/core/engine.py", line 63, in __init__
    self.downloader = Downloader(crawler)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/core/downloader/__init__.py", line 73, in __init__
    self.handlers = DownloadHandlers(crawler)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/core/downloader/handlers/__init__.py", line 18, in __init__
    cls = load_object(clspath)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/utils/misc.py", line 40, in load_object
    mod = import_module(module)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/core/downloader/handlers/s3.py", line 4, in <module>
    from .http import HTTPDownloadHandler
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/core/downloader/handlers/http.py", line 5, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/core/downloader/handlers/http11.py", line 15, in <module>
    from scrapy.xlib.tx import Agent, ProxyAgent, ResponseDone, \
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/xlib/tx/__init__.py", line 6, in <module>
    from . import client, endpoints
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/xlib/tx/client.py", line 37, in <module>
    from .endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint
  File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/xlib/tx/endpoints.py", line 222, in <module>
    interfaces.IProcessTransport, '_process')):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/zope/interface/declarations.py", line 495, in __call__
    raise TypeError("Can't use implementer with classes.  Use one of "
TypeError: Can't use implementer with classes.  Use one of the class-declaration functions instead.
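Since the TypeError is raised inside zope.interface while Scrapy's bundled scrapy.xlib.tx code is being imported, my guess (only a guess) is a version mismatch between Twisted 12.0.0, zope.interface, and Scrapy 0.22.2. Here is a quick check of what my interpreter actually resolves, using the same pkg_resources module that appears in the traceback:

import pkg_resources

# Print the versions Python 2.7 actually picks up; Twisted and
# zope.interface are my suspects, but that is an assumption.
for name in ('Scrapy', 'Twisted', 'zope.interface'):
    try:
        print name, pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        print name, 'not installed'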
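Separately, the deprecation warning at the top tells me to inherit from scrapy.spider.Spider instead of BaseSpider. My understanding of that change is the sketch below (based only on the warning text; I have not confirmed whether it is related to the TypeError):

from scrapy.spider import Spider  # replaces the deprecated BaseSpider

class DmozSpider(Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    # start_urls and parse() unchanged from my spider above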