postgresql - Automatic failover not working in repmgr

Question

I have successfully set up two nodes (master and standby). I am running repmgr 2.0 with PostgreSQL 9.3.6.

Standby repmgr.conf

cluster=test
node=2
node_name=node2
conninfo='host=192.168.1.218 user=repmgr_usr dbname=repmgr_db'
pg_bindir='/usr/lib/postgresql/9.3/bin'

master_response_timeout=30  
reconnect_attempts=2  
reconnect_interval=10  
failover=automatic

Master repmgr.conf

cluster=test
node=1
node_name=master
conninfo='host=192.168.1.205 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/usr/lib/postgresql/9.3/bin
master_response_timeout=30   
reconnect_attempts=2 
reconnect_interval=10
failover=automatic
promote_command='/etc/repmgr/auto_failover.sh'

When I stop the standby node (the PostgreSQL service), I get the following in the repmgrd log file:

[WARNING] repmgrd: Connection to standby has been lost, trying to recover... 20 seconds before failover decision
[2015-04-02 20:47:43] [WARNING] repmgrd: Connection to standby has been lost, trying to recover... 10 seconds before failover decision
[2015-04-02 20:47:53] [ERROR] repmgrd: We couldn't reconnect for long enough, exiting...
[2015-04-02 20:47:53] [ERROR] Failed to connect to local node, exiting!

The script is not executed. Please help.

score 1 · Accepted Answer

For the script to run, you need to stop the master node, not the standby, because failover only happens when the master fails.
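
When testing this, it helps to know how long repmgrd keeps retrying before it makes a failover decision. Judging by the log in the question ("20 seconds before failover decision", then "10 seconds"), the window is roughly reconnect_attempts multiplied by reconnect_interval. A minimal sketch of that arithmetic follows; retry_window is a made-up helper name for illustration, not something repmgr provides.

```shell
#!/bin/sh
# retry_window ATTEMPTS INTERVAL
# Hypothetical helper (not part of repmgr): prints the approximate number of
# seconds repmgrd spends retrying before it makes a failover decision.
retry_window() {
    attempts=$1
    interval=$2
    echo $((attempts * interval))
}

# Values taken from the repmgr.conf files in the question:
# reconnect_attempts=2, reconnect_interval=10
retry_window 2 10   # prints 20
```

So after stopping PostgreSQL on the master (192.168.1.205), expect about 20 seconds of warnings in the standby's repmgrd log before promote_command is invoked. How you stop the master depends on your install; on a Debian/Ubuntu-style setup it would typically be something like `sudo service postgresql stop`, but that is an assumption about your system.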

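Also, add shared_preload_libraries = 'repmgr_funcs' to your PostgreSQL configuration file, /etc/postgresql/9.3/main/postgresql.conf. PostgreSQL must be restarted for a change to shared_preload_libraries to take effect.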

And add the following lines to your /etc/repmgr/repmgr.conf file:

promote_command='repmgr standby promote -f /etc/repmgr/repmgr.conf'
follow_command='repmgr standby follow -f /etc/repmgr/repmgr.conf'
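
A quick way to confirm these settings are actually present (and not commented out) is to grep for them. The sketch below assumes the file paths mentioned in this answer; has_setting is a hypothetical helper written for illustration, not a repmgr or PostgreSQL tool.

```shell
#!/bin/sh
# has_setting FILE KEY
# Hypothetical helper: succeeds if FILE contains an uncommented line
# of the form "KEY=..." or "KEY = ...".
has_setting() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=" "$1" 2>/dev/null
}

# Paths as given in this answer; adjust them if your layout differs.
REPMGR_CONF=/etc/repmgr/repmgr.conf
PG_CONF=/etc/postgresql/9.3/main/postgresql.conf

for key in failover promote_command follow_command; do
    if has_setting "$REPMGR_CONF" "$key"; then
        echo "$key is set in $REPMGR_CONF"
    else
        echo "$key is missing from $REPMGR_CONF"
    fi
done

if has_setting "$PG_CONF" shared_preload_libraries; then
    echo "shared_preload_libraries is set in $PG_CONF"
else
    echo "shared_preload_libraries is missing from $PG_CONF"
fi
```

This only checks that each key is present; it does not validate the values.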

To be sure, check that repmgrd is running by executing ps aux | grep -i rep.
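
One caveat with that check: the grep process itself contains "rep" in its name, so the pipeline always prints at least one line even when repmgrd is not running. A slightly stricter sketch matches the exact command name instead; is_running is a hypothetical helper name, and this assumes a ps that supports the POSIX -e and -o options.

```shell
#!/bin/sh
# is_running NAME
# Hypothetical helper: succeeds if some process has a command name
# that is exactly NAME (so the grep itself is never counted).
is_running() {
    ps -e -o comm= | grep -x "$1" >/dev/null
}

if is_running repmgrd; then
    echo "repmgrd is running"
else
    echo "repmgrd is NOT running"
fi
```

Run it on the standby, since that is where repmgrd has to be watching the master for automatic failover to trigger.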

Hope this helps.
Best regards
