
I have about 4000 input files (around 7MB each on average).

My pipeline always fails at the CoGroupByKey step once the data size reaches about 4GB. When I limited the input to only 300 files, it ran fine.
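For scale, here is a quick back-of-the-envelope check on those numbers. It is only a rough sketch: it assumes the 7MB average holds across all files and uses decimal units (1 GB = 1000 MB), and the names are just for illustration.

```python
# Rough size estimate from the figures in this question.
# Assumption: every file is close to the 7MB average; 1 GB = 1000 MB.
AVG_FILE_MB = 7        # average file size
TOTAL_FILES = 4000     # full input
WORKING_FILES = 300    # subset that runs fine
FAILURE_GB = 4         # approximate size at which CoGroupByKey fails


def total_gb(n_files, avg_mb=AVG_FILE_MB):
    """Return the approximate total size in GB for n_files files."""
    return n_files * avg_mb / 1000


full_input_gb = total_gb(TOTAL_FILES)
working_gb = total_gb(WORKING_FILES)
# Approximate number of files that add up to the failure size.
files_at_failure = FAILURE_GB * 1000 // AVG_FILE_MB

print(f"full input: ~{full_input_gb} GB")
print(f"working subset: ~{working_gb} GB")
print(f"failure starts around {files_at_failure} files")
```

So the subset that works is roughly 2GB, the failures begin somewhere near 570 files, and the full input would be about 28GB.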

When it fails, the logs on GCP Dataflow only show:

Workflow failed. Causes: S24:CoGroup Geo data/GroupByKey/Read+CoGroup Geo data/GroupByKey/GroupByWindow+CoGroup Geo data/Map(_merge_tagged_vals_under_key) failed., The job failed because a work item has failed 4 times. Look in previous log entries for the cause of each one of the 4 failures. For more information, see https://cloud.google.com/dataflow/docs/guides/common-errors. The work item was attempted on these workers: 
  store-migration-10212040-aoi4-harness-m7j7
      Root cause: The worker lost contact with the service.,
  store-migration-xxxxx
      Root cause: The worker lost contact with the service.,
  store-migration-xxxxx
      Root cause: The worker lost contact with the service.,
  store-migration-xxxxx
      Root cause: The worker lost contact with the service.

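I dug through all the logs in Logs Explorer. Apart from the above, nothing else indicates an error, not even the output from my own logging.info calls and try...except blocks.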

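I think this is related to the memory of the worker instances, but I haven't dug in that direction, because it is the kind of thing I don't want to have to worry about when using a GCP service.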

Thanks.
