I want to create a periodic task in Django that runs a Scrapy crawl, but judging from the log the task never finishes on its own; it only ends when I stop the consumer. I believe I need to connect the spider's signals to the Huey task so the task knows when the crawl is done, but I'm lost on how to do that. Any ideas?

Here is the Huey consumer log, followed by my task code:

INFO  2021-05-20 19:34:36,500 consumer.start :: Huey consumer started with 1 thread, PID 4000 at 2021-05-20 19:34:36.500022
INFO  2021-05-20 19:34:36,500 consumer.start :: Scheduler runs every 1 second(s).
INFO  2021-05-20 19:34:36,500 consumer.start :: Periodic tasks are enabled.
INFO  2021-05-20 19:34:36,500 consumer.start :: The following commands are available:
+ app.tasks.run_spider
[2021-05-20 19:34:48,260] INFO:huey:Worker-1:Executing app.tasks.run_spider: 1e843422-2ab6-48fa-bf8b-6b7fe25266f8
INFO  2021-05-20 19:40:07,864 consumer.run :: Received SIGINT
INFO  2021-05-20 19:40:07,865 consumer.stop :: Shutting down gracefully...
[2021-05-20 19:40:07,868] INFO:huey:Worker-1:app.tasks.run_spider: 1e843422-2ab6-48fa-bf8b-6b7fe25266f8 executed in 319.608s
INFO  2021-05-20 19:40:07,869 consumer.stop :: All workers have stopped.
INFO  2021-05-20 19:40:07,869 consumer.run :: Consumer exiting.
from multiprocessing import Process

from django.conf import settings
from huey.contrib.djhuey import db_task
from scrapy import signals
from scrapy.crawler import Crawler
from twisted.internet import reactor

from app.spiders import MySpider  # assumed module path; adjust to your project


def run_script():
    crawler = Crawler(MySpider, settings=settings.SCRAPY_SETTINGS)
    # Stop the reactor when the spider closes, so run_script() can return.
    crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
    reactor.run()  # blocks until reactor.stop() is called


@db_task()
def run_spider():
    # Run the crawl in a separate process, because a Twisted reactor
    # cannot be restarted inside a long-lived worker process.
    process = Process(target=run_script)
    process.start()
    process.join()  # the Huey worker waits here until the child process exits
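
For reference, a common alternative is to let Scrapy manage the reactor lifecycle itself with CrawlerProcess, whose start() blocks until the crawl finishes and then stops the reactor, so the child process exits on its own. A minimal sketch, assuming the same settings object and spider (the import path for MySpider is an assumption):

from multiprocessing import Process

from django.conf import settings
from huey.contrib.djhuey import db_task
from scrapy.crawler import CrawlerProcess

from app.spiders import MySpider  # assumed module path


def run_crawler_process():
    crawler_process = CrawlerProcess(settings=settings.SCRAPY_SETTINGS)
    crawler_process.crawl(MySpider)
    crawler_process.start()  # blocks until the crawl finishes, then stops the reactor


@db_task()
def run_spider_alt():
    process = Process(target=run_crawler_process)
    process.start()
    process.join()  # returns once the crawl has completed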
Versions:

  • huey==2.3.2
  • django==3.2.3
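
Since the goal is a periodic task, Huey's Django integration also provides db_periodic_task with a crontab schedule; a minimal sketch reusing the child-process pattern above (the every-10-minutes schedule is an assumption, adjust as needed):

from multiprocessing import Process

from huey import crontab
from huey.contrib.djhuey import db_periodic_task


@db_periodic_task(crontab(minute='*/10'))  # assumed schedule: every 10 minutes
def run_spider_periodic():
    # Same child-process pattern as above, so every scheduled run
    # gets a fresh reactor.
    process = Process(target=run_script)  # run_script as defined earlier
    process.start()
    process.join()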