My Hive table has only one partition:
show partitions hive_test;
OK
pt=20130805000000
Time taken: 0.124 seconds
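(For context, the storage location registered for that partition can be inspected like this; a sketch assuming a standard metastore setup, and the exact output layout depends on the Hive version:)

DESCRIBE FORMATTED hive_test PARTITION (pt='20130805000000');
-- the "Location:" row in the output is the HDFS path Hive hands to the InputFormat for this partition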
But when I run a simple SQL query, the data files under the folder 20130805000000 are not read. Why doesn't Hive just use the files under 20130805000000?
The SQL:
SELECT buyer_id AS USER_ID from hive_test limit 1;
Here is the exception:
java.io.IOException: /group/myhive/test/hive/hive_test/pt=20130101000000/data doesn't exist!
at org.apache.hadoop.hdfs.DFSClient.listPathWithLocations(DFSClient.java:1045)
at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:352)
at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.listLocatedStatus(ChRootedFileSystem.java:270)
at org.apache.hadoop.fs.viewfs.ViewFileSystem.listLocatedStatus(ViewFileSystem.java:851)
at org.apache.hadoop.hdfs.Yunti3FileSystem.listLocatedStatus(Yunti3FileSystem.java:349)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listLocatedStatus(SequenceFileInputFormat.java:49)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:242)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:261)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1238)
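(In case it is useful, the input paths Hive resolves for this query can also be surfaced from the query plan; a sketch, and the exact plan output depends on the Hive version:)

-- the "Path -> Alias" / "Path -> Partition" sections list the locations Hive will scan
EXPLAIN EXTENDED SELECT buyer_id AS USER_ID FROM hive_test LIMIT 1;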
My question is: why does Hive try to find the file "/group/myhive/test/hive/hive_test/pt=20130101000000/data" instead of "/group/myhive/test/hive/hive_test/pt=20130101000000/"?
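For reference, if it turns out the metastore simply holds a stale location for this partition, I understand it could be repointed; this is only a sketch I have not run, and the target path is just assumed from the directory layout above:

-- repoint the partition at the real directory (assumed path; a scheme prefix such as
-- hdfs:// or viewfs:// may be needed depending on the cluster's default filesystem)
ALTER TABLE hive_test PARTITION (pt='20130805000000')
  SET LOCATION '/group/myhive/test/hive/hive_test/pt=20130805000000';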