I am using the Apache Curator Leader Election recipe in my application: https://curator.apache.org/curator-recipes/leader-election.html
ZooKeeper version: 3.5.7. Curator version: 4.0.1.
Here is the sequence of steps:
1. Whenever my Tomcat server instance starts, I create a CuratorFramework instance (one per Tomcat server) and start it:
CuratorFramework client = CuratorFrameworkFactory.newClient(connectionString, retryPolicy);
client.start();
if (!client.blockUntilConnected(10, TimeUnit.MINUTES)) {
    LOGGER.error("Zookeeper connection could not establish!");
    throw new RuntimeException("Zookeeper connection could not establish");
}
2. I create an LSAdapter instance and start it:
LSAdapter adapter = new LSAdapter(client, <some_metadata>);
adapter.start();
Here is my LSAdapter class:
public class LSAdapter extends LeaderSelectorListenerAdapter implements Closeable {
    //<Class instance variables defined>

    public LSAdapter(CuratorFramework client, <some_metadata>) {
        leaderSelector = new LeaderSelector(client, <path_to_be_used_for_leader_election>, this);
        leaderSelector.autoRequeue();
    }

    public void start() throws IOException {
        leaderSelector.start();
    }

    @Override
    public void close() throws IOException {
        leaderSelector.close();
    }

    @Override
    public void takeLeadership(CuratorFramework client) throws Exception {
        final int waitSeconds = (int) (5 * Math.random()) + 1;
        LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
        LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
        while (true) {
            try {
                Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
                //do leader tasks
            } catch (InterruptedException e) {
                LOGGER.error(name + " was interrupted.");
                //cleanup
                Thread.currentThread().interrupt();
            } finally {
            }
        }
    }
}
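3. When the server instance shuts down, I close the LSAdapter instance (the one the application is using) and then close the CuratorFramework client that was created: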
CloseableUtils.closeQuietly(lsAdapter);
curatorFrameworkClient.close();
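The problem I am facing is that sometimes, when the servers are restarted, no leader gets elected. I check this by tracing the log messages from takeLeadership(). I have two Tomcat server instances running the code above, both connected to the same ZooKeeper quorum. Most of the time one of the instances becomes the leader, but when this problem occurs, both of them remain followers. Please suggest what I am doing wrong.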
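For reference, here is a minimal standalone sketch of how a loop shaped like the one in takeLeadership() reacts to an interrupt in plain Java. It does not use Curator at all. The class name, method names, and the iteration cap are hypothetical and exist only so the demo terminates. It compares a catch block that restores the interrupt flag and keeps looping with one that restores the flag and returns.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Standalone illustration of Java interrupt semantics; no Curator classes are used.
// All names here are hypothetical and are not part of the application above.
public class InterruptLoopDemo {

    // Loop shaped like the one in takeLeadership(): the catch block restores the
    // interrupt flag but stays in the loop. maxIterations exists only so the demo ends.
    // Returns how many sleep() calls ended with InterruptedException.
    public static int runReinterruptingLoop(int maxIterations) {
        int interruptedSleeps = 0;
        for (int i = 0; i < maxIterations; i++) {
            try {
                Thread.sleep(TimeUnit.SECONDS.toMillis(1));
            } catch (InterruptedException e) {
                interruptedSleeps++;
                Thread.currentThread().interrupt(); // flag is set again, loop goes on
            }
        }
        return interruptedSleeps;
    }

    // Variant that leaves the method on interrupt.
    // Returns the number of iterations that were started.
    public static int runReturningLoop(int maxIterations) {
        int iterations = 0;
        for (int i = 0; i < maxIterations; i++) {
            iterations++;
            try {
                Thread.sleep(TimeUnit.SECONDS.toMillis(1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag for the caller
                return iterations;                  // and stop looping
            }
        }
        return iterations;
    }

    // Runs body with the current thread's interrupt flag already set, then clears it.
    public static int withInterruptFlagSet(IntSupplier body) {
        Thread.currentThread().interrupt();
        try {
            return body.getAsInt();
        } finally {
            Thread.interrupted(); // clear the flag so later code is unaffected
        }
    }

    public static void main(String[] args) {
        int spinning = withInterruptFlagSet(() -> runReinterruptingLoop(5));
        int returning = withInterruptFlagSet(() -> runReturningLoop(5));
        System.out.println("re-interrupting loop, interrupted sleeps: " + spinning);
        System.out.println("returning loop, iterations started: " + returning);
    }
}
```

In the first variant every sleep() call fails immediately, because the flag is set again on each pass, so an uncapped while (true) would keep spinning and the method would never return. Whether this has any bearing on the missing leader is not established here.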