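I am running Hive on a standalone machine, with Hadoop in pseudo-distributed mode. I am running a Hive query that joins two tables (one has 7M records and the other has 51M records, each with 8 columns). After processing for a while, the map progress drops to 0% and then keeps printing 0% at intervals. Can you help me resolve this?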

See the log below.

2016-04-12 22:52:58,469 Stage-1 map = 71%,  reduce = 1%
2016-04-12 22:53:00,517 Stage-1 map = 72%,  reduce = 1%
2016-04-12 22:53:02,560 Stage-1 map = 73%,  reduce = 1%
2016-04-12 22:53:09,740 Stage-1 map = 74%,  reduce = 1%
2016-04-12 22:53:11,796 Stage-1 map = 75%,  reduce = 1%
2016-04-12 22:53:13,842 Stage-1 map = 76%,  reduce = 1%
2016-04-12 22:53:21,037 Stage-1 map = 77%,  reduce = 1%
2016-04-12 22:53:24,114 Stage-1 map = 78%,  reduce = 1%
2016-04-12 22:53:26,156 Stage-1 map = 79%,  reduce = 1%
2016-04-12 22:53:35,433 Stage-1 map = 81%,  reduce = 1%
2016-04-12 22:53:38,507 Stage-1 map = 82%,  reduce = 1%
2016-04-12 22:53:45,725 Stage-1 map = 82%,  reduce = 0%
2016-04-12 22:53:49,925 Stage-1 map = 0%,  reduce = 0%
2016-04-12 22:54:50,236 Stage-1 map = 0%,  reduce = 0%
2016-04-12 22:55:50,546 Stage-1 map = 0%,  reduce = 0%
2016-04-12 22:56:50,863 Stage-1 map = 0%,  reduce = 0%
2016-04-12 22:57:51,128 Stage-1 map = 0%,  reduce = 0%
2016-04-12 22:58:51,352 Stage-1 map = 0%,  reduce = 0%
2016-04-12 22:59:51,612 Stage-1 map = 0%,  reduce = 0%
2016-04-12 23:00:51,886 Stage-1 map = 0%,  reduce = 0%
2016-04-12 23:01:52,131 Stage-1 map = 0%,  reduce = 0%

I checked the status in the tracker. It shows two attempts, one of which failed with the following diagnostic message.

AM Container for appattempt_1460481465127_0001_000001 exited with exitCode: -100
For more detailed output, check application tracking page:http://localhost:8088/cluster/app/application_1460481465127_0001Then, click on links to logs of each attempt.
Diagnostics: Container released on a *lost* nodeFailing this attempt

Thanks in advance.

1 Answer

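The problem appears to be caused by heap space on the map side.
Try increasing the memory available to the map tasks as follows:

In mapred-site.xml (adjust the values below to suit your use case):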

<!-- for mappers -->
<property>
   <name>mapreduce.map.memory.mb</name>
   <value>4096</value>
</property>
<!-- For reduces -->
<property>
   <name>mapreduce.reduce.memory.mb</name>
   <value>8192</value>
</property>
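
One detail worth checking alongside these settings: `mapreduce.map.memory.mb` and `mapreduce.reduce.memory.mb` set the size of the YARN container, while the JVM heap inside that container is controlled separately by `mapreduce.map.java.opts` and `mapreduce.reduce.java.opts`. The following is a minimal sketch, also for mapred-site.xml. The `-Xmx` values are an assumption, not part of the original answer: they are sized at roughly 80% of the container sizes of 4096 MB and 8192 MB, a common rule of thumb that leaves headroom for non-heap memory.

```xml
<!-- Assumed values: heap set to about 80% of the container size above -->
<!-- JVM heap for mappers (container: 4096 MB) -->
<property>
   <name>mapreduce.map.java.opts</name>
   <value>-Xmx3276m</value>
</property>
<!-- JVM heap for reducers (container: 8192 MB) -->
<property>
   <name>mapreduce.reduce.java.opts</name>
   <value>-Xmx6553m</value>
</property>
```

On a single pseudo-distributed machine, containers of this size can only be allocated if the node offers that much memory, so `yarn.nodemanager.resource.memory-mb` in yarn-site.xml would need to be at least as large as the biggest container requested.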
Answered 2016-05-15T09:52:54.557