hadoop - How to change the max container capability in a Hadoop cluster

Question

I installed RHadoop on the Hortonworks Sandbox by following these instructions: http://www.research.janahang.com/install-rhadoop-on-hortonworks-hdp-2-0/

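Everything appears to have installed correctly. But when I run the test script at the bottom of those instructions I get an error, and this line seems the most likely cause: (REDUCE capability required is more than the supported max container capability in the cluster. Killing the Job. reduceResourceReqt: 4096 maxContainerCapability:2250).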

How do I set maxContainerCapability, or otherwise fix this problem? Any help is welcome. Thanks.

The error output is here:

Be sure to run hdfs.init()
14/09/09 14:29:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/09/09 14:29:27 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.4.0.2.1.1.0-385.jar] /tmp/streamjob4407691883964292767.jar tmpDir=null
14/09/09 14:29:29 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/192.168.32.128:8050
14/09/09 14:29:29 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/192.168.32.128:8050
14/09/09 14:29:31 INFO mapred.FileInputFormat: Total input paths to process : 1
14/09/09 14:29:32 INFO mapreduce.JobSubmitter: number of splits:2
14/09/09 14:29:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1410297633075_0001
14/09/09 14:29:33 INFO impl.YarnClientImpl: Submitted application application_1410297633075_0001
14/09/09 14:29:33 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1410297633075_0001/
14/09/09 14:29:33 INFO mapreduce.Job: Running job: job_1410297633075_0001
14/09/09 14:29:42 INFO mapreduce.Job: Job job_1410297633075_0001 running in uber mode : false
14/09/09 14:29:42 INFO mapreduce.Job:  map 100% reduce 100%
14/09/09 14:29:43 INFO mapreduce.Job: Job job_1410297633075_0001 failed with state KILLED due to: MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceReqt: 4096 maxContainerCapability:2250
Job received Kill while in RUNNING state.
REDUCE capability required is more than the supported max container capability in the cluster. Killing the Job. reduceResourceReqt: 4096 maxContainerCapability:2250

14/09/09 14:29:43 INFO mapreduce.Job: Counters: 2
    Job Counters
            Total time spent by all maps in occupied slots (ms)=0
            Total time spent by all reduces in occupied slots (ms)=0    
14/09/09 14:29:43 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  :
hadoop streaming failed with error code 1
Calls: wordcount -> mapreduce -> mr
Execution halted
14/09/09 14:29:49 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://sandbox.hortonworks.com:8020/tmp/file1f937beb4f39' to trash at: hdfs://sandbox.hortonworks.com:8020/user/root/.Trash/Current

score 1 · Accepted Answer

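To do this on Hortonworks 2.1, I had to:

Increase the VirtualBox memory from 4096 MB to 8192 MB (I don't know whether this was strictly necessary)
Enable Ambari from http://my.local.host:8000
Log in to Ambari at http://my.local.host:8080
Change the values of yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb from their defaults to 4096
Save and restart everything (through Ambari)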

This got rid of the "capability required" error, but the actual wordcount.R job does not seem to want to finish. Things like hdfs.ls("/data") do work, however.
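
To make the relationship between the two YARN settings and the error message concrete, here is a minimal Python sketch. It assumes that the maxContainerCapability value in the log corresponds to yarn.scheduler.maximum-allocation-mb, which matches the fix described above but is not confirmed by the log itself. The XML excerpt is a hypothetical yarn-site.xml fragment showing how the two properties would look after the change, and the helper names read_properties and fits_in_container are made up for illustration. On the sandbox the values are changed through Ambari rather than by editing the file by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of yarn-site.xml after the change described above.
# Only the two property names and the value 4096 come from the answer.
YARN_SITE = """
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return the name/value pairs of a Hadoop-style configuration as a dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def fits_in_container(requested_mb, max_container_mb):
    """Mirror the comparison in the error message: a task request larger
    than the maximum container size is rejected."""
    return requested_mb <= max_container_mb


props = read_properties(YARN_SITE)
max_mb = int(props["yarn.scheduler.maximum-allocation-mb"])

# Values from the failing log: 4096 MB requested, 2250 MB allowed.
print(fits_in_container(4096, 2250))
# Same request after raising the limit to 4096 MB.
print(fits_in_container(4096, max_mb))
```

The first check fails and the second passes, which is consistent with the "capability required" error disappearing once the limit is raised to the 4096 MB that the map and reduce tasks request.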

score -2 · Accepted Answer


This memory problem is not easy to solve, but I switched to the Cloudera platform and everything works as expected.

Answered 2014-09-19T17:26:36.290
