scrapy - How to continue crawling after return item in Scrapy?

Question

In my spider, after receiving a response, I want to download and display a captcha image and then continue crawling:

    def get_captcha(self, response):
        print '\nLoading captcha...\n'
        item = CaptchaItem()
        hxs = HtmlXPathSelector(response)
        captcha_img_src = hxs.select('//*[@id="captcha-image"]/@src').extract()[0]
        item['image_urls'] = [captcha_img_src]
        return item

But I don't know when the image has finished loading, or how to continue crawling after that.

FYI: The captcha image can't be downloaded without cookies.

Thanks in advance!

score 0 · Accepted Answer

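Use yield instead of return. The callback then becomes a generator, so it can emit the item and afterwards emit further requests: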

    def get_captcha(self, response):
        print '\nLoading captcha...\n'
        item = CaptchaItem()
        hxs = HtmlXPathSelector(response)
        captcha_img_src = hxs.select('//*[@id="captcha-image"]/@src').extract()[0]
        item['image_urls'] = [captcha_img_src]
        yield item
        # You may display the scraped item here; after that,
        # build your follow-up POST request and yield it.
        yield your_request  # placeholder for the follow-up request
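
To see why yield lets the callback keep going, here is a minimal sketch that runs without Scrapy installed. The `CaptchaItem` and `Request` classes, the `consume` function, and the example.com URLs are hypothetical stand-ins for illustration only, not Scrapy's real classes. It only shows the order in which yielded objects are handed over; it says nothing about when the image download itself finishes.

```python
# Hypothetical stand-ins so the sketch runs without Scrapy installed.
class CaptchaItem(dict):
    """Stand-in for a Scrapy item with an 'image_urls' field."""


class Request(object):
    """Stand-in for a request object: just a URL and a callback."""

    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback


def get_captcha(captcha_img_src):
    """Generator callback: yields the item first, then a follow-up request."""
    item = CaptchaItem(image_urls=[captcha_img_src])
    yield item
    # Execution resumes here when the consumer asks for the next value.
    yield Request("http://example.com/login")  # assumed follow-up URL


def consume(results):
    """Toy consumer: iterate over the callback's output and sort it by type."""
    items, requests = [], []
    for obj in results:
        if isinstance(obj, Request):
            requests.append(obj)
        else:
            items.append(obj)
    return items, requests


items, requests = consume(get_captcha("http://example.com/captcha.png"))
```

With return, the function would end at the item and the follow-up request would never be produced; with yield, both objects reach the consumer in order.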
