21

是否可以使用 asyncio 进行多个循环?如果回答是肯定的,我该怎么做?我的用例是: * 我从异步网站列表中提取 url * 对于每个“子 url 列表”,我会在 async/ 中抓取它们

提取网址的示例:

import asyncio
import aiohttp
from suburls import extractsuburls

@asyncio.coroutine
def extracturls(url):
    subtasks = []
    response = yield from aiohttp.request('GET', url)
    suburl_list = yield from response.text()
    for suburl in suburl_list:
        subtasks.append(asyncio.Task(extractsuburls(suburl)))
     loop = asyncio.get_event_loop()
     loop.run_until_complete(asyncio.gather(*subtasks))

 if __name__ == '__main__':
     urls_list = ['http://example1.com', 'http://example2.com']
     for url in url_list: 
          subtasks.append(asyncio.Task(extractsuburls(url)))  
     loop = asyncio.get_event_loop()
     loop.run_until_complete(asyncio.gather(*subtasks))
     loop.close()

如果我执行此代码,当 python 将尝试启动第二个循环时,我会遇到一个错误,女巫说一个循环已经在运行。

PS:我的模块“extractsuburls”使用 aiohttp 来执行 web 请求。

编辑:

好吧,我已经尝试过这个解决方案:

import asyncio
import aiohttp
from suburls import extractsuburls

@asyncio.coroutine
def extracturls( url ):
    subtasks = []
    response = yield from aiohttp.request('GET', url)
    suburl_list = yield from response.text()
    jobs_loop = asyncio.new_event_loop()
    for suburl in suburl_list:
        subtasks.append(asyncio.Task(extractsuburls(suburl)))
     asyncio.new_event_loop(jobs_loop)
     jobs_loop.run_until_complete(asyncio.gather(*subtasks))
     jobs_loop.close()

 if __name__ == '__main__':
     urls_list = ['http://example1.com', 'http://example2.com']
     for url in url_list: 
          subtasks.append(asyncio.Task(extractsuburls(url)))  
     loop = asyncio.get_event_loop()
     loop.run_until_complete(asyncio.gather(*subtasks))
     loop.close()

但我有这个错误:循环参数必须与未来一致

任何的想法?

4

1 回答 1

35

您不需要多个事件循环,只需yield from gather(*subtasks)extracturls()协程中使用:

import asyncio
import aiohttp
from suburls import extractsuburls

@asyncio.coroutine
def extracturls(url):
    subtasks = []
    response = yield from aiohttp.request('GET', url)
    suburl_list = yield from response.text()
    for suburl in suburl_list:
        subtasks.append(extractsuburls(suburl))
    yield from asyncio.gather(*subtasks)

 if __name__ == '__main__':
     urls_list = ['http://example1.com', 'http://example2.com']
     for url in url_list: 
          subtasks.append(extractsuburls(url))
     loop = asyncio.get_event_loop()
     loop.run_until_complete(asyncio.gather(*subtasks))
     loop.close()

结果,您将等待子任务直到extracturls完成。

于 2014-12-22T18:50:39.083 回答