
Could someone help with this problem involving NiFi 1.3.0 and Hive? I get the same error with both Hive 1.2 and Hive 2.1.1. The Hive table is partitioned, bucketed, and stored as ORC.

The partitions get created on HDFS, but the data fails at the write stage. Please see the logs below:

[5:07 AM] papesdiop: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
[5:13 AM] papesdiop: I get in log see next, hope it might help too:
[5:13 AM] papesdiop: Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)

Full stack trace:

Reconnecting. org.apache.thrift.transport.TTransportException: null
  at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
  at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
  at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
  at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient. ...
hive.metastore Trying to connect to metastore with URI thrift://localhost:9083
2017-09-07 06:41:31,893 INFO [Timer-Driven Process Thread-3] hive.metastore Connected to metastore.
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Failed to create HiveWriter for endpoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }: org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
2017-09-07 06:41:31,911 DEBUG [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] has chosen to yield its resources; will not be scheduled to run again for 1000 milliseconds
2017-09-07 06:41:31,912 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Hive Streaming connect/write error, flow file will be penalized and routed to retry. org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
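For context, the "TransactionError: Unable to acquire lock" above comes from Hive's streaming transaction layer, which only works against a metastore configured for ACID tables. Below is a minimal hive-site.xml sketch of the settings the Hive streaming ingest documentation lists as prerequisites; the worker-thread count is an illustrative value, not taken from my actual configuration:

    <!-- hive-site.xml (excerpt): prerequisites for Hive streaming / ACID tables; values illustrative -->
    <property>
      <name>hive.support.concurrency</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.txn.manager</name>
      <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
    </property>
    <property>
      <name>hive.compactor.initiator.on</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.compactor.worker.threads</name>
      <value>1</value> <!-- assumed example value; must be greater than 0 -->
    </property>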

The Hive table:

CREATE TABLE mydb.guys(`firstname` string, `lastname` string)
PARTITIONED BY (`job` string)
CLUSTERED BY (firstname) INTO 10 BUCKETS
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS ORC
LOCATION 'hdfs://localhost:9000/user/papesdiop/guys'
TBLPROPERTIES ('transactional'='true')
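To double-check on my side that the table really is transactional, and to see whether any locks are outstanding, I run the standard Hive CLI commands below (output omitted here):

    hive -e "DESCRIBE FORMATTED mydb.guys;"  # 'transactional'='true' should appear under Table Parameters
    hive -e "SHOW LOCKS mydb.guys;"          # lists locks currently held through the DbTxnManager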

Thanks in advance.


1 Answer


If it is failing during the write to HDFS, perhaps your user does not have permission to write to the target directory? If you can get more details from the full stack trace, please add them to your question, as that helps with diagnosing the problem. When I ran into this a while back, it was because my NiFi user needed to be created on the target OS and added to the appropriate HDFS group(s) before PutHiveStreaming had permission to write the ORC files into HDFS. A rough illustration of that check is sketched below.
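The 'nifi' user and 'hadoop' group in this sketch are assumptions for the example, not values from the question; the standard HDFS CLI can show and, if needed, fix ownership of the target directory:

    # Inspect ownership/permissions on the table location from the question
    hdfs dfs -ls hdfs://localhost:9000/user/papesdiop/guys
    # Option 1: give the NiFi service user ownership of the path (run as the HDFS superuser)
    sudo -u hdfs hdfs dfs -chown -R nifi:hadoop /user/papesdiop/guys
    # Option 2: add the OS-level NiFi user to the group that already owns the path
    sudo usermod -aG hadoop nifi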

Answered 2017-09-05T17:21:59.223