What exactly is the zookeeper quorum setting in hbase-site.xml?
2 Answers
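As described in hbase-default.xml, the setting is:
Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.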
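This was actually answered by Edward J. Yoon. I have edited it for clarity:
Apache ZooKeeper is a coordination service for distributed applications, similar to Google's Chubby. Many projects use ZooKeeper, and we (Apache Hama) also use it for barrier synchronization in our Bulk Synchronous Parallel computing framework.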
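Today I looked further into the Paxos and dynamic quorum features of the ZooKeeper project in order to better name the class org.apache.hama.zookeeper.QuorumPeer. Because the documentation is sparse (http://hadoop.apache.org/zookeeper/docs/r3.0.0/api/index.html), I did not understand what "quorum" meant; the word is a bit odd to me. As it turns out, "org.apache.hama.zookeeper.QuorumPeer" is the right name! xD So what is a quorum, and why do we need one?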
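According to Wikipedia, a quorum is the minimum number of members of a deliberative body necessary to conduct the business of that group. Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.
As is well known, fault tolerance is one of the important features of a distributed system. The quorum algorithm is used to prevent split-brain situations. When a split-brain occurs, ZooKeeper uses the quorum algorithm to determine the "Primary Partition" and the "Secondary Partition". The servers in the primary group then receive and process user requests, while the servers in the secondary group become read-only.
When does the system recover from the split-brain state? When the partitions merge back into a single partition. Internally, ZooKeeper uses an atomic broadcast protocol rather than Paxos.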
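To make the majority idea concrete, here is a minimal sketch of the arithmetic, assuming a simple majority quorum (ZooKeeper's default; other quorum schemes can be configured). The function names are hypothetical and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(partition_size, ensemble_size):
    """True if a partition holds enough servers to act as the primary side."""
    return partition_size >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


# A 5-server ensemble split 3/2 by a network partition:
# only the side with 3 servers still holds a majority.
ensemble = 5
for side in (3, 2):
    print(side, "servers -> quorum:", has_quorum(side, ensemble))
```

Under this assumption an even-sized ensemble tolerates no more failures than the next smaller odd one (4 servers tolerate 1 failure, the same as 3), which is why odd ensemble sizes are the usual recommendation.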
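You should also read the original, in case I have mistranslated the concepts he was trying to present.
My understanding of the quorum mechanism in Apache ZooKeeper is that it explicitly defines a replication quorum across a number of predefined hosts. If that quorum is not met, the dissenting partitions are split off into a secondary partition until ZooKeeper can reintegrate them with the primary partition.
This adds more granularity to Hadoop's eventual consistency model. Meanwhile, HBase is currently integrating ZooKeeper further into its code.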
From the hbase-default.xml file:
Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.
And from the Getting Started's Requirements section:
HBase depends on ZooKeeper as of release 0.20.0. HBase keeps the location of its root table, who the current master is, and what regions are currently participating in the cluster in ZooKeeper. Clients and Servers now must know their ZooKeeper Quorum locations before they can do anything else (Usually they pick up this information from configuration supplied on their CLASSPATH). By default, HBase will manage a single ZooKeeper instance for you. In standalone and pseudo-distributed modes this is usually enough, but for fully-distributed mode you should configure a ZooKeeper quorum (more info below).
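For illustration, here is a rough sketch of what the setting could look like in hbase-site.xml and how the comma separated value breaks down into a host list. The property name is hbase.zookeeper.quorum; the hostnames are just the placeholders from the description above, and read_quorum is a hypothetical helper, not something HBase ships.

```python
import xml.etree.ElementTree as ET

# Example hbase-site.xml content. The three hostnames are placeholders
# taken from the hbase-default.xml description, not real servers.
HBASE_SITE = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>host1.mydomain.com,host2.mydomain.com,host3.mydomain.com</value>
  </property>
</configuration>
"""


def read_quorum(xml_text, default="localhost"):
    """Return the ZooKeeper hosts listed under hbase.zookeeper.quorum.

    Falls back to [default] when the property is missing or empty,
    mirroring the documented localhost default.
    """
    root = ET.fromstring(xml_text.strip())
    for prop in root.findall("property"):
        if prop.findtext("name") == "hbase.zookeeper.quorum":
            value = prop.findtext("value", default="")
            hosts = [h.strip() for h in value.split(",") if h.strip()]
            return hosts or [default]
    return [default]


quorum_hosts = read_quorum(HBASE_SITE)
print(quorum_hosts)
```

Both clients and servers need to see the same list, which is why the file normally lives in a configuration directory on the CLASSPATH.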
Hope that helps.