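I have a frustrating problem. When I run a job in Hadoop, the map phase runs fine and reaches 100% with no failures. However, when the reduce phase runs, it stops at 67% and never moves past it. I am new to Hadoop and have searched a lot online, but I am still confused. Here is the output: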

13/10/25 21:40:00 INFO input.FileInputFormat: Total input paths to process : 2
13/10/25 21:40:01 INFO mapred.JobClient: Running job: job_201310252001_0003
13/10/25 21:40:02 INFO mapred.JobClient:  map 0% reduce 0%
13/10/25 21:40:30 INFO mapred.JobClient:  map 1% reduce 0%
13/10/25 21:40:37 INFO mapred.JobClient:  map 2% reduce 0%
13/10/25 21:40:39 INFO mapred.JobClient:  map 3% reduce 0%
13/10/25 21:40:40 INFO mapred.JobClient:  map 4% reduce 0%
13/10/25 21:40:42 INFO mapred.JobClient:  map 5% reduce 0%
13/10/25 21:40:43 INFO mapred.JobClient:  map 6% reduce 0%
13/10/25 21:40:45 INFO mapred.JobClient:  map 7% reduce 0%
13/10/25 21:40:46 INFO mapred.JobClient:  map 9% reduce 0%
13/10/25 21:40:48 INFO mapred.JobClient:  map 10% reduce 0%
13/10/25 21:40:49 INFO mapred.JobClient:  map 11% reduce 0%
13/10/25 21:40:52 INFO mapred.JobClient:  map 14% reduce 0%
13/10/25 21:40:55 INFO mapred.JobClient:  map 17% reduce 0%
13/10/25 21:40:58 INFO mapred.JobClient:  map 19% reduce 0%
13/10/25 21:41:01 INFO mapred.JobClient:  map 22% reduce 0%
13/10/25 21:41:04 INFO mapred.JobClient:  map 23% reduce 0%
13/10/25 21:41:05 INFO mapred.JobClient:  map 24% reduce 0%
13/10/25 21:41:07 INFO mapred.JobClient:  map 26% reduce 0%
13/10/25 21:41:08 INFO mapred.JobClient:  map 27% reduce 0%
13/10/25 21:41:10 INFO mapred.JobClient:  map 28% reduce 0%
13/10/25 21:41:11 INFO mapred.JobClient:  map 29% reduce 0%
13/10/25 21:41:13 INFO mapred.JobClient:  map 30% reduce 0%
13/10/25 21:41:14 INFO mapred.JobClient:  map 31% reduce 0%
13/10/25 21:41:16 INFO mapred.JobClient:  map 32% reduce 0%
13/10/25 21:41:20 INFO mapred.JobClient:  map 34% reduce 0%
13/10/25 21:41:23 INFO mapred.JobClient:  map 35% reduce 0%
13/10/25 21:41:26 INFO mapred.JobClient:  map 36% reduce 0%
13/10/25 21:41:34 INFO mapred.JobClient:  map 37% reduce 0%
13/10/25 21:41:39 INFO mapred.JobClient:  map 38% reduce 0%
13/10/25 21:41:43 INFO mapred.JobClient:  map 40% reduce 0%
13/10/25 21:41:44 INFO mapred.JobClient:  map 40% reduce 6%
13/10/25 21:41:46 INFO mapred.JobClient:  map 42% reduce 6%
13/10/25 21:41:49 INFO mapred.JobClient:  map 43% reduce 6%
13/10/25 21:41:51 INFO mapred.JobClient:  map 44% reduce 6%
13/10/25 21:41:52 INFO mapred.JobClient:  map 45% reduce 6%
13/10/25 21:41:55 INFO mapred.JobClient:  map 46% reduce 6%
13/10/25 21:41:57 INFO mapred.JobClient:  map 47% reduce 6%
13/10/25 21:41:58 INFO mapred.JobClient:  map 48% reduce 9%
13/10/25 21:42:01 INFO mapred.JobClient:  map 51% reduce 12%
13/10/25 21:42:04 INFO mapred.JobClient:  map 54% reduce 12%
13/10/25 21:42:07 INFO mapred.JobClient:  map 56% reduce 12%
13/10/25 21:42:10 INFO mapred.JobClient:  map 58% reduce 12%
13/10/25 21:42:13 INFO mapred.JobClient:  map 60% reduce 12%
13/10/25 21:42:16 INFO mapred.JobClient:  map 61% reduce 12%
13/10/25 21:42:19 INFO mapred.JobClient:  map 62% reduce 15%
13/10/25 21:42:22 INFO mapred.JobClient:  map 63% reduce 15%
13/10/25 21:42:23 INFO mapred.JobClient:  map 65% reduce 15%
13/10/25 21:42:26 INFO mapred.JobClient:  map 66% reduce 15%
13/10/25 21:42:28 INFO mapred.JobClient:  map 67% reduce 15%
13/10/25 21:42:29 INFO mapred.JobClient:  map 68% reduce 15%
13/10/25 21:42:32 INFO mapred.JobClient:  map 69% reduce 15%
13/10/25 21:42:34 INFO mapred.JobClient:  map 70% reduce 18%
13/10/25 21:42:35 INFO mapred.JobClient:  map 72% reduce 18%
13/10/25 21:42:38 INFO mapred.JobClient:  map 75% reduce 18%
13/10/25 21:42:41 INFO mapred.JobClient:  map 77% reduce 18%
13/10/25 21:42:44 INFO mapred.JobClient:  map 80% reduce 18%
13/10/25 21:42:47 INFO mapred.JobClient:  map 82% reduce 18%
13/10/25 21:42:50 INFO mapred.JobClient:  map 85% reduce 18%
13/10/25 21:42:53 INFO mapred.JobClient:  map 87% reduce 18%
13/10/25 21:42:56 INFO mapred.JobClient:  map 88% reduce 18%
13/10/25 21:42:59 INFO mapred.JobClient:  map 89% reduce 18%
13/10/25 21:43:02 INFO mapred.JobClient:  map 90% reduce 18%
13/10/25 21:43:05 INFO mapred.JobClient:  map 91% reduce 18%
13/10/25 21:43:18 INFO mapred.JobClient:  map 94% reduce 21%
13/10/25 21:43:21 INFO mapred.JobClient:  map 97% reduce 21%
13/10/25 21:43:24 INFO mapred.JobClient:  map 99% reduce 27%
13/10/25 21:43:27 INFO mapred.JobClient:  map 100% reduce 30%
13/10/25 21:43:30 INFO mapred.JobClient:  map 100% reduce 67%

1 Answer

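The symptom here is that your code is getting "stuck" in the reduce phase, either because of an infinite loop, or because it is receiving a ridiculously large amount of data, or for some other reason (perhaps post your reduce code?).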

Here is how the percentage works in the reducer:

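  1. 0-33% is the shuffle. This is data being moved from the mappers to the reducers (see how it starts before the mappers finish).
  2. 33%-67% is the sort. This can only start once the mappers have finished (see how it went from 30% to 67% after the map reached 100%).
  3. 67%-100% is your actual reduce code being run. This percentage goes up every time a reduce task completes. None of your reduce tasks are finishing.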
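
As a rough illustration of the breakdown above, here is a minimal Python sketch that reads the reduce percentage from a `JobClient` progress line and labels it with one of the three phases. The helper names `parse_progress` and `reduce_phase` are hypothetical, not part of Hadoop, and the sketch assumes the percentage can be read as a single number, which is only approximate when several reduce tasks are averaged together.

```python
import re

# Matches the progress part of a JobClient line, e.g. "map 100% reduce 67%".
PROGRESS = re.compile(r"map (\d+)% reduce (\d+)%")


def parse_progress(line):
    """Return (map_pct, reduce_pct) from a log line, or None if it has no progress."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def reduce_phase(reduce_pct):
    """Label a reduce percentage using the thirds described in the list above."""
    if reduce_pct < 33:
        return "shuffle"
    if reduce_pct < 67:
        return "sort"
    return "reduce"


# Last line of the output posted in the question.
line = "13/10/25 21:43:30 INFO mapred.JobClient:  map 100% reduce 67%"
map_pct, reduce_pct = parse_progress(line)
print(map_pct, reduce_pct, reduce_phase(reduce_pct))
```

Under these assumptions, a job that sits at 67% has finished shuffling and sorting and is waiting on the user-written reduce code.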

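In the JobTracker interface, take a look at your job and see how much data the reducers are getting. If the number of records in the reducer keeps going up, you may simply have too much data going to the reducer. If the number stays the same, you may be hitting an infinite loop of some sort.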
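
The check described above can be sketched as a small Python function. It assumes you have noted the reducer's input record count from the JobTracker page a few times, some minutes apart. The function name `diagnose` and the sample numbers are hypothetical, and the result is only a hint, not a definitive diagnosis.

```python
def diagnose(record_counts):
    """Classify reduce input record counts sampled over time (oldest first)."""
    if len(record_counts) < 2:
        return "not enough samples"
    if record_counts[-1] > record_counts[0]:
        # The reducer is still consuming input, so it may just be overloaded.
        return "records increasing: possibly too much data for the reducer"
    # No new records are being consumed while the task keeps running.
    return "records unchanged: possibly an infinite loop"


# Hypothetical samples copied by hand from the JobTracker page.
print(diagnose([120000, 450000, 910000]))
print(diagnose([120000, 120000, 120000]))
```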

Answered 2013-10-25T15:39:34.050