
In hive-0.8.1-cdh4.0.1, any query that invokes a reducer causes the task to fail. Queries with MAPJOIN work fine, but a plain JOIN gives an error.

For example:

 hive> select count(*) from table1;    
       Total MapReduce jobs = 1    
       Launching Job 1 out of 1
       Number of reduce tasks determined at compile time: 1        
       12/10/15 23:07:02 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name     
       12/10/15 23:07:02 WARN conf.Configuration: mapred.system.dir is deprecated. Instead, use mapreduce.jobtracker.system.dir    
       12/10/15 23:07:02 WARN conf.Configuration: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir   
       12/10/15 23:07:02 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH    
       WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.    
       Execution log at: /tmp/XXXX/XXXX_20121015230707_c93521d0-4a97-4972-92b9-0fdd3ab42e5f.log
       SLF4J: Class path contains multiple SLF4J bindings.    
       SLF4J: Found binding in [jar:file:/home/XXXX/hadoop-2.0.0-cdh4.0.1/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]    
       SLF4J: Found binding in [jar:file:/home/XXXX/hive-0.8.1-cdh4.0.1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]    
       SLF4J: See <http://www.slf4j.org/codes.html#multiple_bindings> for an explanation.
       Job running in-process (local Hadoop)     
       Hadoop job information for null: number of mappers: 0; number of reducers: 0    
       2012-10-15 23:07:04,721 null map = 0%,  reduce = 0%    
       Ended Job = job_local_0001 with errors    
       Error during job, obtaining debugging information...    
       **Execution failed with exit status: 2**    
       Obtaining error information    
       **Task failed!**    
       Task ID:    
       Stage-1    
       Logs:    
       /tmp/XXXX/hive.log    
       FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

The log file shows that the failure is caused by running out of Java heap space:

**java.lang.Exception: java.lang.OutOfMemoryError: Java heap space**
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:912)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

2 Answers


For Hadoop 2.0.0+, put the following in etc/hadoop/mapred-site.xml:

<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>1</value>
</property>

With that in place, the query will work.
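A rough way to see why this setting matters: the stack trace in the question fails inside `MapTask$MapOutputBuffer.<init>`, which is where the map-side sort buffer is allocated, and its size is governed by `mapreduce.task.io.sort.mb` (in megabytes, 100 by default as far as I know). The sketch below only does the arithmetic. The free-heap figure is a made-up assumption, since the question does not say how much heap the local job runner had, and the function names are hypothetical helpers, not Hadoop APIs.

```python
MIB = 1 << 20  # bytes per mebibyte


def sort_buffer_bytes(sort_mb: int) -> int:
    """Bytes requested for the sort buffer at a given mapreduce.task.io.sort.mb."""
    return sort_mb * MIB


def fits_in_heap(sort_mb: int, free_heap_mb: int) -> bool:
    """True if the buffer alone fits in the assumed free heap (ignores all other allocations)."""
    return sort_buffer_bytes(sort_mb) <= free_heap_mb * MIB


if __name__ == "__main__":
    assumed_free_heap_mb = 64  # hypothetical value, not taken from the question
    for sort_mb in (100, 10, 1):
        print(sort_mb, sort_buffer_bytes(sort_mb), fits_in_heap(sort_mb, assumed_free_heap_mb))
```

A value of 1 is the smallest buffer that gets past the allocation; a larger value that still fits in the available heap would probably mean fewer spills to disk, so it may be worth trying intermediate values.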

answered 2012-10-18T08:23:31.437

Map joins need more memory.

Increase the MapReduce JVM heap size in the MapReduce configuration file, conf/mapred-site.xml:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m -server</value>
</property>
answered 2012-11-01T16:31:21.850