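I have some I/O-bound code that does web scraping for one of my research projects.
The code started out imperative, then turned into list comprehensions, and is now mostly generators: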
from contextlib import suppress

import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    while True:
        # Swallow any error and start the next pass over the index page
        with suppress(Exception):
            page = requests.get(baseUrl).content
            urls = (baseUrl + link['href'] for link in BeautifulSoup(page, 'html.parser').select('.tournament a'))
            resources = (scrape_host(url) for url in urls)
            keywords = ((keywords_for_resource(referer, site_id), rid) for
                        referer, site_id, rid in resources)
            output = (scrape(years, animals) for years, animals in keywords)
            responses = (post_data_with_exception_handling(list(data)) for data in output)
            # Nothing above runs until the generators are consumed here
            for response in responses:
                print(response.status_code)
This style suits me well: because it is generator-based, it does not need to hold much state, so I assumed I could easily turn it into asyncio-based code:
import aiohttp
import async_timeout
from bs4 import BeautifulSoup

async def fetch(session, url):
    with async_timeout.timeout(10):
        async with session.get(url) as response:
            return await response.text()

async def main(loop):
    async with aiohttp.ClientSession(loop=loop) as session:
        page = await fetch(session, baseUrl)
        urls = (baseUrl + link['href'] for link in BeautifulSoup(page, 'html.parser').select('.tournament a'))
        # This is the line that fails on Python 3.5
        subpages = (await fetch(session, url) for url in urls)
However, on Python 3.5 this just raises a SyntaxError, because await expressions are not allowed inside comprehensions.
Python 3.6 promises asynchronous comprehensions in PEP 530 (asynchronous generators themselves are covered by PEP 525).
Will this feature let me easily convert my generator-based code into asyncio-based code, or will it still require a complete rewrite?
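To make the question concrete, here is a minimal sketch of the syntax I understand Python 3.6 to allow. It is not my real scraper: `fake_fetch`, `fetch_all`, `run_demo`, and the URL list are hypothetical stand-ins so the snippet runs without a network, and I am assuming the same shape would carry over to `fetch(session, url)`.

```python
import asyncio


async def fake_fetch(url):
    # Hypothetical stand-in for fetch(session, url); no network access.
    await asyncio.sleep(0)
    return "<html>" + url + "</html>"


async def fetch_all(urls):
    # Asynchronous generator (PEP 525): 'yield' inside 'async def'.
    for url in urls:
        yield await fake_fetch(url)


async def main():
    urls = ["/a", "/b", "/c"]  # made-up paths for illustration

    # 'await' inside a comprehension (PEP 530); pages are fetched one by one.
    pages = [await fake_fetch(url) for url in urls]

    # 'async for' inside a comprehension, consuming the async generator.
    lengths = [len(page) async for page in fetch_all(urls)]

    # Both forms above are sequential; gather() schedules the fetches
    # concurrently and returns the results in input order.
    concurrent = await asyncio.gather(*(fake_fetch(url) for url in urls))

    return pages, lengths, concurrent


def run_demo():
    # Explicit loop management so the sketch also works on Python 3.6.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()
```

As far as I can tell, the comprehension forms keep the pipeline shape but still await each item in turn, so any actual concurrency would have to come from something like `asyncio.gather`.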