If I write a Celery task that calls other Celery tasks, can I release the parent task/worker without waiting for the downstream tasks to finish?
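The situation: I'm working with an API that returns some data along with the parameters for the next API call. I want to put all of the data behind the API into a database. My current approach is to query the API for a batch to process, start some downstream processors, and then recursively re-invoke the API + processing chain. I'm worried that this will lock up workers waiting for all of the recursive API calls to finish, even though the workers don't care about the results of their children.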
Pseudocode:
import json
from celery import chain, task

# api() and store are placeholders for the real API client and storage task.

@task
def apiPing(start=None):
    """Returns a dict of 5 elements, starting at the *start* element, or at the
    beginning of the list if start is not specified. The dict also contains
    'remaining', the number of elements left in the API's list."""
    return json.loads(api(start))

@task
def processList(data):
    """Takes a result from apiPing, starts a task to store each element, and
    starts a chain that calls the API again and processes the next batch."""
    for element in data:
        # Queue one storage task per element.
        store.delay(element)
    if data['remaining'] != 0:
        # Named next_batch so the local does not shadow celery's chain().
        next_batch = chain(apiPing.s(data['last']), processList.s())
        next_batch.delay()
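From what I read here, the above is very close to being bad practice; I don't want the worker handling processList() to stay locked until all of the data in the API has been processed. Is there a way to start the downstream tasks and release the parent worker, or to refactor the above so that it doesn't lock workers?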
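To make the paging contract concrete, here is a minimal, Celery-free sketch of the loop I am trying to distribute. Everything in it is an assumption for illustration: the in-memory `ITEMS` list stands in for the real API, the `'elements'` key is a hypothetical name for the batch payload, and `'last'` is treated as the cursor passed to the next call, as in the pseudocode. It runs sequentially on Python 3 and says nothing about how Celery schedules the work.

```python
# Hypothetical stand-in for the real API: 12 items served 5 at a time.
ITEMS = list(range(12))
PAGE_SIZE = 5


def api_ping(start=None):
    """Return one page as a dict with 'elements', 'last' and 'remaining'."""
    begin = 0 if start is None else start
    page = ITEMS[begin:begin + PAGE_SIZE]
    end = begin + len(page)
    return {"elements": page, "last": end, "remaining": len(ITEMS) - end}


def process_list(data, stored):
    """Store each element, then return the next cursor (None when done)."""
    stored.extend(data["elements"])
    return data["last"] if data["remaining"] != 0 else None


def drain():
    """Call the fake API until 'remaining' reaches 0; return (stored, calls)."""
    stored, start, calls = [], None, 0
    while True:
        data = api_ping(start)
        calls += 1
        start = process_list(data, stored)
        if start is None:
            return stored, calls
```

In the Celery version each iteration of this loop becomes one apiPing + processList chain, so the open question is only whether the task that queues the next iteration has to wait for it.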
Testing with the following tasks shows that the worker really does stay locked:
from celery import task
from time import sleep

@task
def parent():
    print("In parent")
    child.apply_async()
    print("Out of parent")

@task
def child():
    print("In child")
    sleep(10)
    print("Out of child")
[2013-08-05 18:37:29,264: WARNING/PoolWorker-4] In parent
[2013-08-05 18:37:31,278: WARNING/PoolWorker-2] In child
[2013-08-05 18:37:41,285: WARNING/PoolWorker-2] Out of child
[2013-08-05 18:37:41,298: WARNING/PoolWorker-4] Out of parent
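For comparison, this is the ordering I would expect from a fire-and-forget dispatch: "Out of parent" should be logged before "Out of child". The sketch below reproduces that ordering with the standard library's `ThreadPoolExecutor` as a stand-in for the worker pool. It assumes Python 3, it is not Celery, and it does not explain the log above; it only shows the behavior I am hoping to get.

```python
import time
from concurrent.futures import ThreadPoolExecutor

events = []  # list.append is atomic in CPython, so no lock is used here


def child():
    events.append("In child")
    time.sleep(0.5)
    events.append("Out of child")


def parent(pool):
    events.append("In parent")
    future = pool.submit(child)  # returns a Future without waiting for child
    events.append("Out of parent")
    return future


def run():
    """Run parent once and return how long parent() itself took, in seconds."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        started = time.monotonic()
        future = parent(pool)
        parent_elapsed = time.monotonic() - started
        future.result()  # only this harness waits for the child, not parent()
    return parent_elapsed
```

Calling `run()` and then inspecting `events` shows the parent finishing well before the child's sleep ends, which is the opposite of what the worker log above shows.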