I have started HBase, and all the daemons are running.

 $ jps
8482 HQuorumPeer
25105 RemoteMavenServer
9133 SecondaryNameNode
11883 HRegionServer
13793 Jps
8545 NameNode
8572 HMaster
11519 Main
25029 Main
8851 DataNode
9435 RunJar

Now let's try to list the tables:

hbase(main):004:0* list
TABLE

ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times

Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

Tail of the master log:

2013-05-17 22:48:35,609 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=localhost,60020,1368856115352

Tail of the ZooKeeper log:

$ tail *zoo*.log
2013-05-18 00:14:27,651 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /127.0.0.1:49826
2013-05-18 00:14:27,652 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /127.0.0.1:49826
2013-05-18 00:14:27,666 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x13eb59ceb22001e with negotiated timeout 180000 for client /127.0.0.1:49826

Tail of the regionserver log:

2013-05-18 00:08:35,416 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.03 MB, free=244.85 MB, max=246.88 MB, blocks=0, accesses=0, hits=0, hitRatio=0cachingAccesses=0, cachingHits=0, cachingHitsRatio=0evictions=0, evicted=0, evictedPerRun=NaN
2013-05-18 00:13:35,416 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.03 MB, free=244.85 MB, max=246.88 MB, blocks=0, accesses=0, hits=0, hitRatio=0cachingAccesses=0, cachingHits=0, cachingHitsRatio=0evictions=0, evicted=0, evictedPerRun=NaN
2013-05-18 00:18:35,416 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.03 MB, free=244.85 MB, max=246.88 MB, blocks=0, accesses=0, hits=0, hitRatio=0cachingAccesses=0, cachingHits=0, cachingHitsRatio=0evictions=0, evicted=0, evictedPerRun=NaN

More details (in response to @roman below): safe mode is already off.

fsck gives:

hadoop fsck /

.Status: HEALTHY
 Total size:    321466989 B
 Total dirs:    412
 Total files:   446
 Total blocks (validated):  355 (avg. block size 905540 B)
 Minimally replicated blocks:   355 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:   334 (94.08451 %)
 Mis-replicated blocks:     0 (0.0 %)
 Default replication factor:    3
 Average block replication: 1.0
 Corrupt blocks:        0
 Missing replicas:      1109 (312.39438 %)
 Number of data-nodes:      1
 Number of racks:       1
FSCK ended at Sun May 19 13:09:14 PDT 2013 in 147 milliseconds

However, as you suspected, the HBase GUI is not running on port 60030. I don't see any errors in the HBase logs that would explain why.

More info for @roman: hbase hbck just times out with a MasterNotRunningException.

stephenb@gondolin:/shared$ hbase hbck 
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:host.name=gondolin
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_37
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.home=/shared/jdk1.6.0_37/jre
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/shared/hadoop-1.0.3/libexec/../lib/native/Linux-amd64-64:/shared/hbase/lib/native/Linux-amd64-64
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-39-generic
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:user.name=stephenb
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/stephenb
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:user.dir=/shared
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  13/05/19 13:16:16 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
  13/05/19 13:16:16 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 24642@gondolin
  13/05/19 13:16:16 WARN client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
  13/05/19 13:16:16 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
  13/05/19 13:16:16 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
  13/05/19 13:16:16 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13eb59ceb22002f, negotiated timeout = 180000
  13/05/19 13:17:27 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13eb59ceb22002f
  13/05/19 13:17:27 INFO zookeeper.ZooKeeper: Session: 0x13eb59ceb22002f closed
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: EventThread shut down
  13/05/19 13:17:27 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
  13/05/19 13:17:27 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 24642@gondolin
  13/05/19 13:17:27 WARN client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
  13/05/19 13:17:27 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13eb59ceb220030, negotiated timeout = 180000
  13/05/19 13:18:39 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13eb59ceb220030
  13/05/19 13:18:39 INFO zookeeper.ZooKeeper: Session: 0x13eb59ceb220030 closed
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: EventThread shut down
  13/05/19 13:18:39 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
  13/05/19 13:18:39 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 24642@gondolin
  13/05/19 13:18:39 WARN client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
  13/05/19 13:18:39 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13eb59ceb220031, negotiated timeout = 180000
  13/05/19 13:18:51 DEBUG client.HConnectionManager$HConnectionImplementation: The connection to null was closed by the finalize method.
  13/05/19 13:18:51 DEBUG client.HConnectionManager$HConnectionImplementation: 
  13/05/19 13:29:18 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13eb59ceb220039
  13/05/19 13:29:18 INFO zookeeper.ZooKeeper: Session: 0x13eb59ceb220039 closed
  13/05/19 13:29:18 INFO zookeeper.ClientCnxn: EventThread shut down
  Exception in thread "main" org.apache.hadoop.hbase.MasterNotRunningException: Retried 10 times
      at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:130)
      at org.apache.hadoop.hbase.util.HBaseFsck.connect(HBaseFsck.java:264)
      at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3331)
      at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3192)
1 Answer

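And the HBase Web UI isn't running either, is it? I ran into a similar situation after a single-node pseudo-distributed cluster crashed completely: HDFS could not leave safe mode.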

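  1. Check whether HDFS is in safe mode: hadoop dfsadmin -safemode get
  2. If it is, force it out of safe mode manually: hadoop dfsadmin -safemode leave
  3. You should then see progress; at the very least, the HBase Web UI should become visible.
  4. Run an HDFS fsck: hadoop fsck / -move
  5. If all goes well, it is a good idea to follow up with an hbase hbck check.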
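
The steps above can be strung together roughly as follows. This is only a sketch, assuming the Hadoop 1.x CLI from the question, where `hadoop dfsadmin -safemode get` prints "Safe mode is ON" or "Safe mode is OFF". The function names `in_safemode` and `recover` and the `HADOOP`/`HBASE` override variables are made up for illustration.

```shell
# Sketch only: wraps the commands from the list above.
# HADOOP and HBASE default to the CLIs on PATH; override them if yours live elsewhere.
HADOOP="${HADOOP:-hadoop}"
HBASE="${HBASE:-hbase}"

# Succeeds (exit 0) when the NameNode reports that safe mode is ON.
in_safemode() {
  "$HADOOP" dfsadmin -safemode get 2>/dev/null | grep -q "Safe mode is ON"
}

recover() {
  if in_safemode; then
    echo "HDFS is in safe mode; forcing it to leave"
    "$HADOOP" dfsadmin -safemode leave || return 1
  else
    echo "HDFS is not in safe mode"
  fi
  # -move relocates corrupted files to /lost+found
  "$HADOOP" fsck / -move || return 1
  # Only useful once the master is reachable again
  "$HBASE" hbck
}
```

Source the snippet and run `recover`. If hbck still fails with MasterNotRunningException afterwards, the problem is probably not safe mode.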

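Other tips you may need:

  • Check where the region server is bound with netstat -n -a (check the ports against your configuration). It can happen that it binds to the wrong interface. Also search the forums: there are known issues with Hadoop binding and IPv6 (e.g. check this).
  • Check that Hadoop has really left safe mode with hadoop dfsadmin -safemode get. HBase will not start fully until it has.
Answered 2013-05-19T19:07:23.333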
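
For the netstat tip, a small filter like the one below may save some scrolling. It is a sketch that assumes the Linux net-tools column layout (local address in column 4, state in column 6). It also assumes the HBase ports of that era: 60020 and 60030 appear in the question, while 60000 and 60010 are assumed defaults for the master, so adjust the list to match your hbase-site.xml. The function names are invented for this example.

```shell
# Reads `netstat -n -a` output on stdin and prints the local address
# of every TCP socket in LISTEN state on the given port.
listeners_on_port() {
  awk -v port="$1" '$6 == "LISTEN" { n = split($4, a, ":"); if (a[n] == port) print $4 }'
}

# Reports, for each assumed HBase port, which address (if any) is listening.
check_hbase_ports() {
  snapshot="$(cat)"
  for port in 60000 60010 60020 60030; do
    addrs="$(printf '%s\n' "$snapshot" | listeners_on_port "$port" | tr '\n' ' ')"
    addrs="${addrs% }"
    if [ -n "$addrs" ]; then
      echo "port $port: listening on $addrs"
    else
      echo "port $port: not listening"
    fi
  done
}
```

Run it as `netstat -n -a | check_hbase_ports`. A line showing only an IPv6 address such as `:::60020`, or an unexpected interface, is a hint of the binding problem mentioned above.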