0

我用过hadoop-0.20.xx,hive-0.11.0。我会谈论蜂巢查询:使用指定的配置,一切都很好并且工作正常。现在,我们已经升级到 hadoop-2.6.x (hadoop2) 和 hive-0.14.x。也使用 Apache Tez。

问题是,hadoop 按原样工作。但是 hive sql 查询没有。以下查询在旧版本中运行良好。但是在升级版本中抛出错误: QUERY :SELECT abc.property_name, xyz.date, xyz.time, xyz.value_as_number, xyz.value_units FROM dbname.xyz JOIN dbname.abc ON (xyz.id = abc.src_id) WHERE xyz.person_id=138312;

例外:

INFO  : Session is already open
INFO  : Tez session was closed. Reopening...
INFO  : Session re-established.
INFO  :

INFO  : Status: Running (Executing on YARN cluster with App id application_1435524970199_0035)

INFO  : Map 1: -/-      Map 2: -/-
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1435524970199_0035_1_00, diagnostics=[Vertex vertex_1435524970199_0035_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: concept initializer failed, vertex=vertex_1435524970199_0035_1_00 [Map 1], java.io.IOException: No input paths specified in job
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
]
ERROR : Vertex failed, vertexName=Map 2, vertexId=vertex_1435524970199_0035_1_01, diagnostics=[Vertex vertex_1435524970199_0035_1_01 [Map 2] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: observation initializer failed, vertex=vertex_1435524970199_0035_1_01 [Map 2], java.io.IOException: No input paths specified in job
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
]
ERROR : DAG failed due to vertex failure. failedVertices:2 killedVertices:0
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=2)

例外说,No input path specified。好吧,我了解并知道如何在 haodop-mapreduce 程序中解决问题。但是,我们如何使用 hive 查询来做到这一点。无论如何,我不认为这是一样的。

为了弄清楚,我使用了hive shelland beeline shell,hive 返回了预期的输出,但是,beeline 返回了与上面相同的异常。

问题的美妙之处在于对单个表的查询工作正常。但是,当我尝试处理时JOIN,它会引发上述异常。但是,我明白,Apache Tez对我的查询有影响。有人可以建议解决方案或确定 tez 参考,以便我可以相应地读取和重写查询。谢谢

4

1 回答 1

0

它通过禁用apache tez. 看起来apache tez还不稳定。

于 2015-07-08T08:26:46.410 回答