我在用着:
Cloudera Manager Free Edition: 4.5.1
Cloudera Hadoop Distro: CDH 4.2.0-1.cdh4.2.0.p0.10 (Parcel)
Hive Metastore with cloudera manager embedded PostgreSQL database.
我的 cloudera 管理器在单独的机器上运行,它不是集群的一部分。
使用 cloudera manager 设置集群后,我开始通过 hue + beeswax 使用 hive。
一切都运行良好了一段时间,然后突然间,每当我对具有大量分区(大约 14000)的特定表运行任何查询时,查询开始超时:
FAILED: SemanticException org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
当我注意到这一点时,我查看了日志,发现与 Hive Metastore 的连接超时:
WARN metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect. org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
看到这里,我认为 hive 元存储有问题。所以我查看了 hive 元存储的日志并发现了 java.lang.OutOfMemoryErrors:
/var/log/hive/hadoop-cmf-hive1-HIVEMETASTORE-hci-cdh01.hcinsight.net.log.out:
2013-05-07 14:13:08,744 ERROR org.apache.thrift.ProcessFunction: Internal error processing get_partitions_ with_auth
java.lang.OutOfMemoryError: Java heap space
at sun.reflectH.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.jav a:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at org.datanucleus.util.ClassUtils.newInstance(ClassUtils.java:95)
at org.datanucleus.store.rdbms.sql.expression.SQLExpressionFactory.newLiteralParameter(SQLExpressi onFactory.java:248)
at org.datanucleus.store.rdbms.scostore.RDBMSMapEntrySetStore.getSQLStatementForIterator(RDBMSMapE ntrySetStore.java:323)
at org.datanucleus.store.rdbms.scostore.RDBMSMapEntrySetStore.iterator(RDBMSMapEntrySetStore.java: 221)
at org.datanucleus.sco.SCOUtils.populateMapDelegateWithStoreData(SCOUtils.java:987)
at org.datanucleus.sco.backed.Map.loadFromStore(Map.java:258)
at org.datanucleus.sco.backed.Map.keySet(Map.java:509)
at org.datanucleus.store.fieldmanager.LoadFieldManager.internalFetchObjectField(LoadFieldManager.j ava:118)
at org.datanucleus.store.fieldmanager.AbstractFetchFieldManager.fetchObjectField(AbstractFetchFiel dManager.java:114)
at org.datanucleus.state.AbstractStateManager.replacingObjectField(AbstractStateManager.java:1183)
at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.ja va)
at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceFields(MStorageDescriptor.j ava)
at org.datanucleus.jdo.state.JDOStateManagerImpl.replaceFields(JDOStateManagerImpl.java:2860)
at org.datanucleus.jdo.state.JDOStateManagerImpl.replaceFields(JDOStateManagerImpl.java:2879)
at org.datanucleus.jdo.state.JDOStateManagerImpl.loadFieldsInFetchPlan(JDOStateManagerImpl.java:16 47)
at org.datanucleus.store.fieldmanager.LoadFieldManager.processPersistable(LoadFieldManager.java:63 )
at org.datanucleus.store.fieldmanager.LoadFieldManager.internalFetchObjectField(LoadFieldManager.j ava:84)
at org.datanucleus.store.fieldmanager.AbstractFetchFieldManager.fetchObjectField(AbstractFetchFiel dManager.java:104)
at org.datanucleus.state.AbstractStateManager.replacingObjectField(AbstractStateManager.java:1183)
at org.apache.hadoop.hive.metastore.model.MPartition.jdoReplaceField(MPartition.java)
at org.apache.hadoop.hive.metastore.model.MPartition.jdoReplaceFields(MPartition.java)
at org.datanucleus.jdo.state.JDOStateManagerImpl.replaceFields(JDOStateManagerImpl.java:2860)
at org.datanucleus.jdo.state.JDOStateManagerImpl.replaceFields(JDOStateManagerImpl.java:2879)
at org.datanucleus.jdo.state.JDOStateManagerImpl.loadFieldsInFetchPlan(JDOStateManagerImpl.java:16 47)
at org.datanucleus.ObjectManagerImpl.performDetachAllOnTxnEndPreparation(ObjectManagerImpl.java:35 52)
at org.datanucleus.ObjectManagerImpl.preCommit(ObjectManagerImpl.java:3291)
at org.datanucleus.TransactionImpl.internalPreCommit(TransactionImpl.java:369)
at org.datanucleus.TransactionImpl.commit(TransactionImpl.java:256)
此时,配置单元元存储关闭并重新启动:
2013-05-07 14:39:40,576 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: Shutting down hive metastore.
2013-05-07 14:41:09,979 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: Starting hive metastore on po rt 9083
现在,为了解决这个问题,我更改了配置单元元存储服务器和蜂蜡服务器的最大堆大小:
1. Hive/Hive Metastore Server(Base)/Resource Management/Java Heap Size of Metastore Server : 2 GiB (First thing I did.)
2. Hue/Beeswax Server(Base)/Resource Management/Java Heap Size of Beeswax Server : 2 GiB (After reading some groups posts and stuff online, I tried this as well.)
当我继续在 hive 元存储日志中看到 OOME 时,上述 2 个步骤似乎都没有帮助。
然后我注意到实际的元存储“数据库”正在作为我的 cloudera 管理器的一部分运行,我想知道 PostgreSQL 进程是否内存不足。我寻找增加该进程的 java 堆大小的方法,但发现的文档很少。
我想知道你们中的一个人是否可以帮助我解决这个问题。
我应该增加嵌入式数据库的 java 堆大小吗?如果是这样,我会在哪里做这个?
还有什么我想念的吗?
谢谢!