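Problem inserting data into a Hive table using dynamic partitioning.

I have an existing table with one normal column and one partition column, and I am trying to insert data into both columns. My code: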

// Preparing writer
WriteEntity.Builder builder = new WriteEntity.Builder();
WriteEntity entity = builder.withDatabase(DATABASE_NAME).withTable(TABLE_NAME).withPartition(null).build();
HCatWriter masterHCatWriter = DataTransferFactory.getHCatWriter(entity, CUSTOM_CONFIG);
WriterContext writerContext = masterHCatWriter.prepareWrite();
HCatWriter hCatWriter = DataTransferFactory.getHCatWriter(writerContext);

// Preparing record to be written
List<HCatRecord> hCatRecordsBatch = new ArrayList<HCatRecord>();
HCatRecord hCatRecord = new DefaultHCatRecord(2);
hCatRecord.set(0, "aaa");
hCatRecord.set(1, "bbb");
hCatRecordsBatch.add(hCatRecord);

// Writing record
hCatWriter.write(hCatRecordsBatch.iterator());

But I get this exception:

org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : Failed while writing. Cause : org.apache.hive.hcatalog.common.HCatException : 2010 : Invalid partition values specified : Unable to configure dynamic partitioning for storage handler, mismatch between number of partition values obtained[0] and number of partition values required[1]
at org.apache.hive.hcatalog.data.transfer.impl.HCatOutputFormatWriter.write(HCatOutputFormatWriter.java:112)
at ...private classes...
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hive.hcatalog.common.HCatException : 2010 : Invalid partition values specified : Unable to configure dynamic partitioning for storage handler, mismatch between number of partition values obtained[0] and number of partition values required[1]
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.configureOutputStorageHandler(HCatBaseOutputFormat.java:156)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.configureDynamicStorageHandler(FileRecordWriterContainer.java:264)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:183)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
at org.apache.hive.hcatalog.data.transfer.impl.HCatOutputFormatWriter.write(HCatOutputFormatWriter.java:98)
... 8 more

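I looked through the Hive library code, and it seems that the prepareWrite() method, which is called on the master node, picks up the wrong schema. It loads the schema with only the normal column (the partition column is missing), and afterwards it cannot retrieve the partition column's value from the inserted record (which is exactly what the exception says: ...number of partition values obtained[0]...). There is a similar question on SO, but in my case I cannot append the column to the schema, because the schema handling is wrapped inside the prepareWrite() method.

I am using the libraries from Cloudera version 5.3.2 (which means Hive version 0.13.1).

I would appreciate any help. Thanks.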
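
To illustrate the mismatch as I understand it, here is a simplified, self-contained model. It is not the actual Hive/HCatalog code: the class PartitionMismatchSketch, its methods, and the column names data_col and part_col are all hypothetical and exist only for this sketch. It assumes the writer finds partition values by looking up each partition key in the schema it was configured with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the suspected behavior; NOT the real HCatalog implementation.
public class PartitionMismatchSketch {

    // Hypothetical helper: collects the record values located at the positions
    // of the partition keys within the schema the writer was configured with.
    static List<String> extractPartitionValues(List<String> writerSchema,
                                               List<String> partitionKeys,
                                               List<String> record) {
        List<String> values = new ArrayList<>();
        for (String key : partitionKeys) {
            int pos = writerSchema.indexOf(key);
            // A key that is absent from the schema yields no value at all.
            if (pos >= 0 && pos < record.size()) {
                values.add(record.get(pos));
            }
        }
        return values;
    }

    // Hypothetical check that compares obtained and required partition value counts.
    static String check(List<String> writerSchema,
                        List<String> partitionKeys,
                        List<String> record) {
        int obtained = extractPartitionValues(writerSchema, partitionKeys, record).size();
        int required = partitionKeys.size();
        return obtained == required
                ? "ok"
                : "mismatch: obtained[" + obtained + "] required[" + required + "]";
    }

    public static void main(String[] args) {
        List<String> partitionKeys = Arrays.asList("part_col"); // assumed column name
        List<String> record = Arrays.asList("aaa", "bbb");

        // Schema without the partition column, as I suspect prepareWrite() builds it.
        System.out.println(check(Arrays.asList("data_col"), partitionKeys, record));

        // Schema that also contains the partition column.
        System.out.println(check(Arrays.asList("data_col", "part_col"), partitionKeys, record));
    }
}
```

In this model the first call reports a mismatch with the same counts as in my exception (obtained 0, required 1), while the second call passes. Whether the real library behaves this way internally is only my reading of the code.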
