I created a table in Hive with the following command:
CREATE TABLE tweet_table(
  tweet STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\n'
LINES TERMINATED BY '\n'
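As an aside, this DDL declares '\n' as both the field terminator and the line terminator, which makes the two delimiters ambiguous. Since the table has only one column, the field terminator could simply be left at its default. A minimal alternative sketch (an assumption on my part, not a confirmed fix), relying on Hive's defaults (Ctrl-A for fields, '\n' for lines):

```sql
-- Sketch: single-column table with default delimiters.
-- With one STRING column per line, no field terminator needs to be set;
-- Hive's default line terminator is already '\n'.
CREATE TABLE tweet_table(
  tweet STRING
);
```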
Then I load some data:
LOAD DATA LOCAL INPATH 'data.txt' INTO TABLE tweet_table
data.txt:
data1
data2
data3data4
data5
The command select * from tweet_table
returns:
data1
data2
data3data4
data5
But select tweet from tweet_table
gives me:
java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:230)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:381)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:374)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:540)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at java.beans.XMLDecoder.readObject(XMLDecoder.java:250)
at org.apache.hadoop.hive.ql.exec.Utilities.deserializeMapRedWork(Utilities.java:542)
at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:222)
... 7 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
It's as if the data were stored in the right table but not in the tweet
column. Why is that?