
I have two scripts, scraper.py and db_control.py. In scraper.py I have something like this:

...
def scrape(category, field, pages, search, use_proxy, proxy_file):
    ...
    loop = asyncio.get_event_loop()

    to_do = [ get_pages(url, params, conngen) for url in urls ]
    wait_coro = asyncio.wait(to_do)
    res, _ = loop.run_until_complete(wait_coro)
    ...
    loop.close()
    
    return [ x.result() for x in res ]

...

And in db_control.py:

from scraper import scrape
...
while new < 15:
    data = scrape(category, field, pages, search, use_proxy, proxy_file)
    ...
...

In theory, the scraper should be restarted an unknown number of times until enough data has been collected. But when new does not immediately become > 15, this error occurs:

  File "/usr/lib/python3.4/asyncio/base_events.py", line 293, in run_until_complete
self._check_closed()
  File "/usr/lib/python3.4/asyncio/base_events.py", line 265, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

However, if I run scrape() only once, the script works fine. So I suspect there is some problem with re-creating the loop via loop = asyncio.get_event_loop(); I have already tried that, but nothing changed. How can I solve this? Of course these are only snippets of my code; if you think the problem may lie elsewhere, the full code is available here.


1 Answer


The methods run_until_complete, run_forever, run_in_executor, create_task and call_at explicitly check the loop and raise an exception if it is closed.
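
A minimal standalone sketch (not from the original post; the coroutine name noop is made up) that reproduces this check on Python 3.4:

import asyncio

@asyncio.coroutine
def noop():
    return 42

loop = asyncio.get_event_loop()
print(loop.run_until_complete(noop()))  # 42 -- the open loop runs the coroutine
loop.close()
loop.run_until_complete(noop())         # RuntimeError: Event loop is closed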

Quoting the docs on BaseEventLoop.close:

This is idempotent and irreversible.


Unless you have some (good) reason, you can simply omit the close line:

def scrape(category, field, pages, search, use_proxy, proxy_file):
    #...
    loop = asyncio.get_event_loop()

    to_do = [ get_pages(url, params, conngen) for url in urls ]
    wait_coro = asyncio.wait(to_do)
    res, _ = loop.run_until_complete(wait_coro)
    #...
    # loop.close()
    return [ x.result() for x in res ]
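
The reason this is enough: asyncio.get_event_loop() keeps handing back the same default loop of the main thread as long as it has not been replaced, so every call to scrape() reuses one open loop. A tiny check (an illustration assuming the default event loop policy, not part of the answer's code):

import asyncio

loop_a = asyncio.get_event_loop()
loop_b = asyncio.get_event_loop()
assert loop_a is loop_b  # the same default loop is returned on every call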

If you want a brand-new loop on every call, you have to create it manually and set it as the default:

def scrape(category, field, pages, search, use_proxy, proxy_file):
    #...
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)    
    to_do = [ get_pages(url, params, conngen) for url in urls ]
    wait_coro = asyncio.wait(to_do)
    res, _ = loop.run_until_complete(wait_coro)
    #...
    return [ x.result() for x in res ]
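
Note that with this variant each old loop is simply abandoned. If you also want to release its resources, one common pattern (a sketch under the same assumptions as the snippet above, not part of the original answer) is to close the fresh loop once its work is done; the next call creates and installs a new one anyway:

def scrape(category, field, pages, search, use_proxy, proxy_file):
    #...
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        to_do = [ get_pages(url, params, conngen) for url in urls ]
        res, _ = loop.run_until_complete(asyncio.wait(to_do))
        return [ x.result() for x in res ]
    finally:
        loop.close()  # safe here: the next scrape() call builds a new loop
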
Answered 2016-06-12T20:36:02.043