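I am trying to use a FLOAT as a timestamp in Hive, but I get an IllegalArgumentException. According to the documentation, the value should be read as Unix time in seconds, with decimal precision: https://cwiki.apache.org/Hive/languagemanual-types.html#LanguageManualTypes-Timestamps I am using Hive 0.10.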

Schema:

    CREATE EXTERNAL TABLE BT_PP (
        Name STRING,
        Application STRING,
        PathId STRING,
        StartTime TIMESTAMP,
        Dimensions MAP<STRING, STRING>,
        Values MAP<STRING, DOUBLE>,
        Failed BOOLEAN,
        VisitId BIGINT,
        ResponseTime DOUBLE,
        Duration DOUBLE,
        CpuTime DOUBLE,
        ExecTime DOUBLE,
        SuspensionTime DOUBLE,
        SyncTime DOUBLE,
        WaitTime DOUBLE
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\;' ESCAPED BY '\\' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY '='

Sample raw row: Web Page Requests;xxx;PT\=115734\;PA\=314959848\;PS\=1725166795;1378315124.621;Complete Uri Path=/<...>;;false;;1616.58935546875;1616.58935546875;642.5269893486796;927.1752303076349;;;

Query: select * from bt_pp where datediff(from_unixtime(unix_timestamp()), startTime) < 2;

Error:

    2013-09-09 12:36:25,687 INFO org.apache.hadoop.mapred.TaskStatus: task-diagnostic-info for task attempt_201308221633_0005_m_000001_3 : java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"Web Page Requests","application":"xxx","pathid":"PT=115734;PA=314959848;PS=1725166795","starttime":"1969-12-31 19:00:00","dimensions":{"Complete Uri Path":"/<...>"},"values":{},"failed":false,"visitid":null,"responsetime":2189.27880859375,"duration":2189.27880859375,"cputime":353.1250106477223,"exectime":940.3325603696519,"suspensiontime":null,"synctime":null,"waittime":null}
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"Web Page Requests","application":"xxx","pathid":"PT=115730;PA=314959848;PS=1725166795","starttime":"1969-12-31 19:00:00","dimensions":{"Complete Uri Path":"/<...>"},"values":{},"failed":false,"visitid":null,"responsetime":2189.27880859375,"duration":2189.27880859375,"cputime":353.1250106477223,"exectime":940.3325603696519,"suspensiontime":null,"synctime":null,"waittime":null}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:673)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141)
        ... 8 more
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating starttime
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:80)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
        at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:132)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:654)
        ... 9 more
    Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
        at java.sql.Timestamp.valueOf(Timestamp.java:185)
        at org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.init(LazyTimestamp.java:74)
        at org.apache.hadoop.hive.serde2.lazy.LazyStruct.uncheckedGetField(LazyStruct.java:219)
        at org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:192)
        at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:188)
        at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:98)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
        ... 18 more

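The error references the timestamp, but it shows the epoch (in EST) rather than the value from the row. The line "Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating starttime" shows that the failure is definitely related to that field.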

1 Answer

According to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps:

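Timestamps in text files have to use the format yyyy-mm-dd hh:mm:ss[.f...]. If they are in another format, declare them as the appropriate type (INT, FLOAT, STRING, etc.) and use a UDF to convert them to timestamps.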

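So you cannot simply store a float (or integer) seconds-since-epoch value in a TIMESTAMP column of a text-backed table. Declare the column as a numeric type and convert it in the query.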
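As a rough sketch of that conversion, assuming StartTime is redeclared as DOUBLE (the ALTER TABLE statement below is one way to do that and has not been verified against Hive 0.10; recreating the external table with the new column type should work as well), the original query could be rewritten along these lines:

```sql
-- Assumption: StartTime is redeclared as DOUBLE so the raw value
-- (for example 1378315124.621) is read as a number instead of being
-- parsed as a yyyy-mm-dd hh:mm:ss string.
ALTER TABLE bt_pp CHANGE StartTime StartTime DOUBLE;

-- from_unixtime() expects whole seconds, so the CAST to BIGINT
-- drops the fractional part before the conversion.
SELECT *
FROM bt_pp
WHERE datediff(from_unixtime(unix_timestamp()),
               from_unixtime(CAST(StartTime AS BIGINT))) < 2;
```

The linked documentation describes floating-point values as being interpreted as Unix time in seconds with decimal precision when they are converted to a timestamp, so `CAST(StartTime AS TIMESTAMP)` may be an alternative when the fractional seconds matter.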

Answered 2013-12-05T12:50:40.787