java - Neo4j embedded: upgrading from 2.3.9 to 3.2.3: initial_hosts do not communicate with each other

Question

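I upgraded my neo4j embedded database from 2.3.9 to 3.2.3 in SINGLE mode, and the upgrade succeeded. After the upgrade, I enabled "HA" mode. When running neo4j as a cluster of 3 instances, I am facing the following problem.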

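Each server runs fine in HA mode on its own (i.e. with ha.initial_hosts = "ip_address_1:5101"), but if I list all three servers under initial_hosts (as shown in the configuration), all three servers stop immediately.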

Am I missing any configuration? Please advise.

Configuration:

neo4j {
            # Enable these two options while upgrading neo4j database.
            # dbms.allow_format_migration=true

            # or weak or strong
            cache_type = "weak"
            # Reduce the default page cache memory allocation
            dbms.memory.pagecache.size="6G"

            # Port to listen to for incoming backup requests.
            dbms.backup.address = ${local.private-ip}":6367"

            # Unique server id for this Neo4j instance
            # can not be negative id and must be unique
            ha.server_id="1"

            # List of other known instances in this cluster
            ha.initial_hosts = "ip_1:5101,ip_2:5101,ip_3:5101"

            # ha.initial_hosts = "ip_1:5101"
            # ha.cluster_server = ${local.private-ip}":5101"

            # IP and port for this instance to bind to for communicating cluster information
            # with the other neo4j instances in the cluster.
            ha.host.coordination = ${local.private-ip}":5101"

            # IP and port for this instance to bind to for communicating data with the
            # other neo4j instances in the cluster.
            ha.host.data = ${local.private-ip}":6365"

            # HA - High Availability
            # SINGLE - Single mode, default.
            dbms.mode="HA"

            # HTTP Connector
            dbms.connector.http.enabled="true"
            dbms.connector.http.listen_address=":7474"

            # Bolt connector
            dbms.connector.bolt.enabled="true"
            dbms.connector.bolt.tls_level="OPTIONAL"
            dbms.connector.bolt.listen_address=":7689"
}

From the neo4j debug.log:

2017-10-09 12:35:47.153+0000 ERROR [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Error while trying to switch to slave Cannot find the master among [] with master serverId=1 and uri=ha://ip_address_1:6365?serverId=1
    java.lang.IllegalStateException: Cannot find the master among [] with master serverId=1 and uri=ha://ip_address_1:6365?serverId=1
            at org.neo4j.kernel.ha.cluster.SwitchToSlave.checkMyStoreIdAndMastersStoreId(SwitchToSlave.java:263)
            at org.neo4j.kernel.ha.cluster.SwitchToSlaveBranchThenCopy.checkDataConsistency(SwitchToSlaveBranchThenCopy.java:142)
            at org.neo4j.kernel.ha.cluster.SwitchToSlave.executeConsistencyChecks(SwitchToSlave.java:478)
            at org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:221)
            at org.neo4j.kernel.ha.cluster.modeswitch.HighAvailabilityModeSwitcher$1.run(HighAvailabilityModeSwitcher.java:355)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
            at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:109)
    2017-10-09 12:35:47.154+0000 INFO [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Attempting to switch to slave in 300s

score 0 · Accepted Answer

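The default value of ha.join_timeout is 30 seconds. The documentation describes the setting as follows:

Timeout for joining a cluster. Defaults to ha.broadcast_timeout. Note that if the timeout expires during cluster formation, the operator may have to restart one or more instances.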

ha.join_timeout=10m

https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_ha.join_timeout
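
As a rough sketch of where this setting would go, the fragment below adds it to the neo4j block from the question. It assumes the application passes every key in that block to the embedded database, as the existing ha.* keys suggest, and the quoting style is copied from the question's config rather than taken from the Neo4j documentation.

```
neo4j {
            # Existing HA settings from the question stay unchanged.
            ha.server_id="1"
            ha.initial_hosts = "ip_1:5101,ip_2:5101,ip_3:5101"

            # Allow up to 10 minutes for this instance to join the cluster.
            # If unset, this falls back to ha.broadcast_timeout.
            ha.join_timeout = "10m"
}
```

The same value would need to be set on each of the three instances, each with its own ha.server_id.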
