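I wrote a spider that worked well the first time. The second time I tried to run it, it did not go beyond the start_urls.
I then tried to fetch the URL in scrapy shell and create an HtmlXPathSelector object from the returned response.
That is when I got the error.
So the steps are: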
[scrapy shell] fetch('http://example.com')  # the real URL is something other than example.com
[scrapy shell] from scrapy.selector import HtmlXPathSelector
[scrapy shell] hxs = HtmlXPathSelector(response)
Traceback:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-a486208adf1e> in <module>()
----> 1 HtmlXPathSelector(response)
/home/codefreak/project-r42catalog/env-r42catalog/lib/python2.7/site-packages/scrapy/selector/lxmlsel.pyc in __init__(self, response, text, namespaces, _root, _expr)
29 body=unicode_to_str(text, 'utf-8'), encoding='utf-8')
30 if response is not None:
---> 31 _root = LxmlDocument(response, self._parser)
32
33 self.namespaces = namespaces
/home/codefreak/project-r42catalog/env-r42catalog/lib/python2.7/site-packages/scrapy/selector/lxmldocument.pyc in __new__(cls, response, parser)
25 if parser not in cache:
26 obj = object_ref.__new__(cls)
---> 27 cache[parser] = _factory(response, parser)
28 return cache[parser]
29
/home/codefreak/project-r42catalog/env-r42catalog/lib/python2.7/site-packages/scrapy/selector/lxmldocument.pyc in _factory(response, parser_cls)
11 def _factory(response, parser_cls):
12 url = response.url
---> 13 body = response.body_as_unicode().strip().encode('utf8') or '<html/>'
14 parser = parser_cls(recover=True, encoding='utf8')
15 return etree.fromstring(body, parser=parser, base_url=url)
Error:
AttributeError: 'Response' object has no attribute 'body_as_unicode'
Am I overlooking something very obvious, or have I stumbled upon a bug in Scrapy?
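
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the error message names a plain 'Response' object, and in the Scrapy version shown in the traceback body_as_unicode is defined on the text response classes rather than on the base Response. The sketch below does not use Scrapy at all. It relies on stand-in classes (the class bodies and the can_build_selector helper are hypothetical, written only for illustration) to show why the attribute lookup fails on a base response and succeeds on a text one.

```python
# Stand-in classes (hypothetical, NOT Scrapy's real implementation) that mirror
# the relationship between a base response and a text response.
class Response(object):
    def __init__(self, url, body=b""):
        self.url = url
        self.body = body


class TextResponse(Response):
    def __init__(self, url, body=b"", encoding="utf-8"):
        super(TextResponse, self).__init__(url, body)
        self.encoding = encoding

    def body_as_unicode(self):
        # Decode the raw bytes using the declared encoding.
        return self.body.decode(self.encoding)


def can_build_selector(response):
    """Hypothetical helper: True if the response exposes body_as_unicode."""
    return hasattr(response, "body_as_unicode")


# A base response (for example, a body that was not recognised as text).
plain = Response("http://example.com", b"\x89PNG")
# A text response, which carries the decoding method.
html = TextResponse("http://example.com", b"<html/>")

print(type(plain).__name__, can_build_selector(plain))
print(type(html).__name__, can_build_selector(html))
```

If this guess applies, inspecting type(response) and response.headers in scrapy shell right after the fetch should show whether the site returned something that Scrapy did not treat as HTML.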