我正在评估使用 Cadence 执行长时间运行的批量操作。我有以下(Kotlin)代码:
class UpdateNameBulkWorkflowImpl : UpdateNameBulkWorkflow {
private val changeNamePromises = mutableListOf<Promise<ChangeNameResult>>()
override fun updateNames(newName: String, entityIds: Collection<String>) {
entityIds.forEach { entityId ->
val childWorkflow = Workflow.newChildWorkflowStub(
UpdateNameBulkWorkflow.UpdateNameSingleWorkflow::class.java
)
val promise = Async.function(childWorkflow::setName, newName, entityId)
changeNamePromises.add(promise)
}
val allDone = Promise.allOf(changeNamePromises)
allDone.get()
}
class UpdateNameSingleWorkflowImpl : UpdateNameBulkWorkflow.UpdateNameSingleWorkflow {
override fun setName(newName: String, entityId: String): SetNameResult {
return Async.function(activities::setName, newName, entityId).get()
}
}
}
这适用于较少数量的实体,但我很快遇到以下异常:
java.lang.RuntimeException: Failure processing decision task. WorkflowID=b5327d20-6ea6-4aba-b863-2165cb21e038, RunID=c85e2278-e483-4c81-8def-f0cc0bd309fd
at com.uber.cadence.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:283) ~[cadence-client-2.7.4.jar:na]
at com.uber.cadence.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:229) ~[cadence-client-2.7.4.jar:na]
at com.uber.cadence.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:76) ~[cadence-client-2.7.4.jar:na]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
Caused by: com.uber.cadence.internal.sync.WorkflowRejectedExecutionError: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@7f17a605[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@7fa9f240[Wrapped task = com.uber.cadence.internal.sync.WorkflowThreadImpl$RunnableWrapper@1a27000b]] rejected from java.util.concurrent.ThreadPoolExecutor@22188bd0[Running, pool size = 600, active threads = 600, queued tasks = 0, completed tasks = 2400]
at com.uber.cadence.internal.sync.WorkflowThreadImpl.start(WorkflowThreadImpl.java:281) ~[cadence-client-2.7.4.jar:na]
at com.uber.cadence.internal.sync.AsyncInternal.execute(AsyncInternal.java:300) ~[cadence-client-2.7.4.jar:na]
at com.uber.cadence.internal.sync.AsyncInternal.function(AsyncInternal.java:111) ~[cadence-client-2.7.4.jar:na]
...
看来我很快就耗尽了线程池,而 Cadence 无法安排新任务。
我通过将定义更改updateNames
为:
override fun updateNames(newName: String, entityIds: Collection<String>) {
entityIds.chunked(200).forEach { sublist ->
val promises = sublist.map { entityId ->
val childWorkflow = Workflow.newChildWorkflowStub(
UpdateNameBulkWorkflow.UpdateNameSingleWorkflow::class.java
)
Async.function(childWorkflow::setName, newName, entityId)
}
val allDone = Promise.allOf(promises)
allDone.get()
}
}
这基本上处理了 200 个块中的项目,并等待每个块完成,然后再移动到下一个块。我担心这将如何执行(块中的单个错误将在重试时停止处理以下块中的所有记录)。我还关心 Cadence 在发生崩溃时能够恢复此功能的进度的能力。
我的问题是:是否有一种惯用的 Cadence 方式来做到这一点,不会导致这种立即的资源耗尽?我使用了错误的技术还是这只是一种幼稚的方法?