I have created a Redis cluster with the following nodes:
xxx.xxx.xxx.195:9100 xxx.xxx.xxx.196:9100 xxx.xxx.xxx.197:9100
xxx.xxx.xxx.195:9200 xxx.xxx.xxx.196:9200 xxx.xxx.xxx.197:9200
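I have found that when I stop 2 Redis instances that are both cluster masters at the same time (the 2 instances on xxx.xxx.xxx.196), the cluster cannot recover. Node roles: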
xxx.xxx.xxx.195:9100 (Master) xxx.xxx.xxx.196:9100 (Master) xxx.xxx.xxx.197:9100 (Slave)
xxx.xxx.xxx.195:9200 (Slave) xxx.xxx.xxx.196:9200 (Master) xxx.xxx.xxx.197:9200 (Slave)
However, if I stop the 2 instances on the .195 server, where 9100 is a master and 9200 is a slave, the cluster recovers and works properly.
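
The difference between the two cases may come down to majority arithmetic. The sketch below assumes the Redis Cluster rule that a slave can only be promoted when a majority of the masters are still reachable to vote. The function name failover_possible is hypothetical and used only for illustration; it is not part of Redis.

```python
def failover_possible(total_masters: int, masters_down: int) -> bool:
    """Return True if the masters still reachable form a strict majority.

    Assumption: a slave promotion needs votes from more than half of
    all masters in the cluster.
    """
    if not 0 <= masters_down <= total_masters:
        raise ValueError("masters_down must be between 0 and total_masters")
    reachable = total_masters - masters_down
    majority = total_masters // 2 + 1
    return reachable >= majority


# Case 1: both instances on .196 are stopped, so 2 of 3 masters are down.
print(failover_possible(3, 2))  # False

# Case 2: both instances on .195 are stopped, so 1 of 3 masters is down.
print(failover_possible(3, 1))  # True
```

If this assumption holds, placing two masters on the same host (.196) means losing that host removes the majority needed to promote any slave.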
Cluster configuration file:
protected-mode no
activerehashing yes
cluster-enabled yes
cluster-config-file /opt/redis/conf/nodes9100.conf
cluster-slave-validity-factor 0
cluster-node-timeout 5000
appendonly yes
Redis log from the affected slave server:
28939:S 09 Oct 16:08:32.834 - 0 clients connected (0 slaves), 1327200 bytes in use
28939:S 09 Oct 16:08:32.834 * Connecting to MASTER xxx.xxx.xxx.196:9200
28939:S 09 Oct 16:08:32.835 * MASTER <-> SLAVE sync started
28939:S 09 Oct 16:08:32.835 # Error condition on socket for SYNC: Connection refused
28939:S 09 Oct 16:08:33.837 * Connecting to MASTER xxx.xxx.xxx.196:9200
28939:S 09 Oct 16:08:33.837 * MASTER <-> SLAVE sync started
28939:S 09 Oct 16:08:33.837 # Error condition on socket for SYNC: Connection refused
28939:S 09 Oct 16:08:34.839 * Connecting to MASTER xxx.xxx.xxx.196:9200
28939:S 09 Oct 16:08:34.839 * MASTER <-> SLAVE sync started
28939:S 09 Oct 16:08:34.839 # Error condition on socket for SYNC: Connection refused
28939:S 09 Oct 16:08:35.840 * Connecting to MASTER xxx.xxx.xxx.196:9200
28939:S 09 Oct 16:08:35.840 * MASTER <-> SLAVE sync started
28939:S 09 Oct 16:08:35.840 # Error condition on socket for SYNC: Connection refused
28939:S 09 Oct 16:08:36.744 - Node 982d9b0a50b393d5fe604caefc0acaae68547648 reported node b57d59fb5685daeaac7e249d99fa257e9be66f4f as not reachable.
28939:S 09 Oct 16:08:36.844 * Connecting to MASTER xxx.xxx.xxx.196:9200
28939:S 09 Oct 16:08:36.844 * MASTER <-> SLAVE sync started
28939:S 09 Oct 16:08:36.844 # Error condition on socket for SYNC: Connection refused