python - JSON not working in Scrapy when calling the spider through a Python script?

Question

When I call my spider from a Python script, like this:

import os
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'project.settings')
from twisted.internet import reactor
from scrapy import log, signals
from scrapy.crawler import Crawler
from scrapy.settings import CrawlerSettings
from scrapy.xlib.pydispatch import dispatcher
from spiders.image import aqaqspider


def stop_reactor():
    reactor.stop()

dispatcher.connect(stop_reactor, signal=signals.spider_closed)
spider = aqaqspider(domain='aqaq.com')
crawler = Crawler(CrawlerSettings())
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
log.msg('Running reactor...')
reactor.run()  # the script will block here until the spider is closed
log.msg('Reactor stopped.')

my JSON file is not created. My pipelines.py contains the following code:

import json
import codecs

class JsonWithEncodingPipeline(object):

    def __init__(self):
        self.file = codecs.open('scraped_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item

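    def close_spider(self, spider):
        # Scrapy calls close_spider() on each pipeline when the spider finishes;
        # a method named spider_closed() only runs if it is connected to the
        # spider_closed signal explicitly.
        self.file.close()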
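
One way to narrow the problem down is to exercise the pipeline without Scrapy at all. The sketch below is a stand-in, not the asker's code. It copies the pipeline, makes the output path a constructor argument (an assumption for illustration), names the cleanup hook close_spider, and drives it by hand with plain dicts through a hypothetical helper, run_pipeline_standalone. If it writes the file, the pipeline logic is probably fine and the difference is more likely in how the script configures the crawler.

```python
import codecs
import json
import os
import tempfile


class JsonWithEncodingPipeline(object):
    """Copy of the pipeline; the `path` argument is an assumption of this sketch."""

    def __init__(self, path='scraped_data_utf8.json'):
        self.file = codecs.open(path, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # One JSON object per line, keeping non-ASCII characters readable.
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item

    def close_spider(self, spider):
        self.file.close()


def run_pipeline_standalone(items, path):
    """Hypothetical helper: push items through the pipeline by hand
    and return the lines that ended up in the output file."""
    pipeline = JsonWithEncodingPipeline(path)
    for item in items:
        pipeline.process_item(item, spider=None)
    pipeline.close_spider(spider=None)
    with codecs.open(path, 'r', encoding='utf-8') as handle:
        return handle.read().splitlines()


if __name__ == '__main__':
    out_path = os.path.join(tempfile.mkdtemp(), 'scraped_data_utf8.json')
    for written in run_pipeline_standalone([{'title': u'caf\u00e9'}], out_path):
        print(written)
```

Passing spider=None works here only because this pipeline never touches the spider argument.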

When I run my spider from the command line with a plain scrapy crawl, it works fine, i.e. the JSON file is created.
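
Since the same pipeline works under scrapy crawl, one thing worth checking (a guess, not something the question confirms) is whether the script can see the project settings that enable the pipeline. The sketch below uses only the standard library. It reads the module name from SCRAPY_SETTINGS_MODULE, tries to import it, and reports the ITEM_PIPELINES value it finds. The function name inspect_settings_module is hypothetical, and 'project.settings' is simply the value used in the script above.

```python
import importlib
import os


def inspect_settings_module(default_module='project.settings'):
    """Hypothetical diagnostic: return (module_name, importable, item_pipelines).

    item_pipelines is None when the module cannot be imported or when it
    does not define ITEM_PIPELINES.
    """
    module_name = os.environ.get('SCRAPY_SETTINGS_MODULE', default_module)
    try:
        settings_module = importlib.import_module(module_name)
    except ImportError:
        # The module is not on sys.path, e.g. the script was started from a
        # directory other than the project root.
        return module_name, False, None
    return module_name, True, getattr(settings_module, 'ITEM_PIPELINES', None)


if __name__ == '__main__':
    name, importable, pipelines = inspect_settings_module()
    print('settings module: %s' % name)
    print('importable:      %s' % importable)
    print('ITEM_PIPELINES:  %r' % (pipelines,))
```

If the module imports and lists the pipeline, the next thing to check is whether the settings object passed to the crawler is actually built from it. In current Scrapy releases, scrapy.utils.project.get_project_settings() is the documented way to load project settings from a script.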

Please help me. I am new to Scrapy.

Thanks, everyone! I found the solution...
