
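I have a data pipeline in luigi that works fine when I run it with 1 worker. However, if I use > 1 workers, it dies (unexpected exit code -11) at a stage that has 2 dependencies. The code is fairly complex, so it is hard to give a minimal example. The gist of it is that I am doing the following with gensim: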

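  1. Build a dictionary from some texts.
  2. Build a corpus from those texts and the dictionary (requires (1)).
  3. Train an LDA model from the corpus and the dictionary (requires (1) and (2)).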

For some reason, step (3) crashes every time I use more than one worker, even when (1) and (2) have already completed...

Any help would be greatly appreciated!

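Edit: here is a sample of the log output. TrainLDA is task (3). Two more tasks after it require TrainLDA. All earlier tasks complete correctly. I replaced TrainLDA's parameters with ... so that the output is more readable. The extra output is just print statements we added to help us understand what is going on.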

DEBUG: Pending tasks: 3
DEBUG: Asking scheduler for work...
INFO: [pid 28851] Worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825) running   TrainLDA(...)
INFO: Done
INFO: There are no more tasks to run at this time
INFO: TrainLDA(...) is currently run by worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825)
==============================
Corriendo LDA de spanish con nivel de limpieza stopwords
==============================
Número de tópicos: 40
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: TrainLDA(...) is currently run by worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825)
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: TrainLDA(...) is currently run by worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825)
INFO: Worker task TrainLDA(...) died unexpectedly with exit code -11
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: There are 2 pending tasks possibly being run by other workers
INFO: There are 2 pending tasks unique to this worker
INFO: Worker Worker(salt=514562349, workers=4, host=felipe.local, username=Felipe, pid=28825) was stopped. Shutting down Keep-Alive thread
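
A note on reading the exit code in the log above: when a child process is reported with a negative exit code on a POSIX system, the absolute value is the number of the signal that terminated it, and signal 11 is SIGSEGV on Linux and macOS. The sketch below shows one way to decode such a code using only the standard library. The `describe_exit_code` helper is a hypothetical name, and the child process that kills itself is only a stand-in for a crashing worker, not a reproduction of the actual problem.

```python
import signal
import subprocess
import sys


def describe_exit_code(code: int) -> str:
    """Turn a subprocess/multiprocessing-style exit code into readable text."""
    if code < 0:
        # A negative code means the child was terminated by signal number -code.
        return f"killed by signal {signal.Signals(-code).name}"
    return f"exited normally with status {code}"


# Stand-in for a crashing worker: a child that sends SIGSEGV to itself (POSIX only).
child = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
)
print(describe_exit_code(child.returncode))
```

Read this way, the -11 suggests the TrainLDA worker process was killed by a segmentation fault rather than by a Python exception, which would also explain why there is no traceback in the log.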