Steps and Precautions for Manually Switching repmgr Nodes

Originally published 2025-11-20 16:32:59 · 431 views

License: CC 4.0 BY-SA

Tags: #HighGo Database

Purpose

This document describes how to perform a manual primary/standby switchover with repmgr and what to watch out for when doing so.

Details

Procedure:

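A primary/standby switchover in a repmgr cluster is performed with the repmgr standby switchover command. It promotes a standby to primary and demotes the current primary to a standby, so it must be run on the standby that is to be promoted.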

First, check the status of the repmgr cluster:

repmgr cluster show
 ID | Name           | Role    | Status    | Upstream       | Location | Priority | Replication lag | Last replayed LSN
----+----------------+---------+-----------+----------------+----------+----------+-----------------+-------------------
 1  | x.x.80.126 | standby |   running | x.x.80.127 | default  | 100      | 0 bytes         | 0/3030F48
 2  | x.x.80.127 | primary | * running |                | default  | 100      | n/a             | none
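
Before switching, it can help to confirm mechanically which node currently holds the primary role. The following is a minimal sketch that extracts the primary's name from the table printed by repmgr cluster show. The function name primary_node_name is hypothetical, and the parsing assumes the column layout shown above (Name in the second column, Role in the third).

```shell
# Print the Name column of every row whose Role column is "primary".
# Reads the output of `repmgr cluster show` on stdin.
# Assumes the "|"-separated layout shown above (Name = column 2, Role = column 3).
primary_node_name() {
  awk -F'|' '
    NF >= 3 {
      name = $2
      role = $3
      # strip the padding around each cell
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", role)
      if (role == "primary") print name
    }'
}
```

On a cluster node this would be used by piping repmgr cluster show into primary_node_name. The switchover should then be started on a node other than the one printed.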

Then run repmgr standby switchover on the standby:

repmgr standby switchover
WARNING: hg_diag/disabl: unknown name/value pair provided; ignoring
NOTICE: executing switchover on node "x.x.80.126" (ID: 1)
WARNING: repmgrd not running on node x.x.80.127 (ID 2)
NOTICE: local node "x.x.80.126" (ID: 1) will be promoted to primary; current primary "x.x.80.127" (ID: 2) will be demoted to standby
NOTICE: stopping current primary node "x.x.80.127" (ID: 2)
NOTICE: issuing CHECKPOINT
DETAIL: executing server command "/opt/HighGo5.6.5-cluster/bin/pg_ctl  -D '/opt/HighGo5.6.5-cluster/data' -W -m fast stop"
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/3059688
NOTICE: promoting standby to primary
DETAIL: promoting server "x.x.80.126" (ID: 1) using "/opt/HighGo5.6.5-cluster/bin/pg_ctl  -w -D '/opt/HighGo5.6.5-cluster/data' promote"
waiting for server to promote....2020-03-30 12:07:15.425 CST [1899] LOG:  00000: received promote request
2020-03-30 12:07:15.426 CST [1899] LOG:  00000: redo done at 0/3059688
2020-03-30 12:07:15.426 CST [1899] LOG:  00000: last completed transaction was at log time 2020-03-30 11:38:47.85602+08
2020-03-30 12:07:15.430 CST [1899] LOG:  00000: selected new timeline ID: 2
2020-03-30 12:07:15.955 CST [1899] LOG:  00000: archive recovery complete
2020-03-30 12:07:15.992 CST [1895] LOG:  00000: database system is ready to accept connections
 done
server promoted
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "x.x.80.126" (ID: 1) was successfully promoted to primary
INFO: local node 2 can attach to rejoin target node 1
DETAIL: local node's recovery point: 0/3059688; rejoin target node's fork point: 0/30596F8
NOTICE: setting node 2's slot name to "repmgr_slot_2"
NOTICE: setting node 2's upstream to node 1
WARNING: unable to ping "host=x.x.80.127 user=highgo password=pdjn300 dbname=highgo port=5866 connect_timeout=2"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "/opt/HighGo5.6.5-cluster/bin/pg_ctl  -w -D '/opt/HighGo5.6.5-cluster/data' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 1
NOTICE: switchover was successful
DETAIL: node "x.x.80.126" is now primary and node "x.x.80.127" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

Then check the status again:

repmgr cluster show
 ID | Name           | Role    | Status    | Upstream       | Location | Priority | Replication lag | Last replayed LSN
----+----------------+---------+-----------+----------------+----------+----------+-----------------+-------------------
 1  | x.x.80.126 | primary | * running |                | default  | 100      | n/a             | none
 2  | x.x.80.127 | standby |   running | x.x.80.126 | default  | 100      | 0 bytes         | 0/305BBC0

The output shows that the primary and standby roles have been swapped.

Notes:

1. A switchover requires mutual SSH trust (passwordless SSH) between the nodes; without it the command fails with:

ERROR: unable to connect via SSH to host "x.x.80.127", user ""
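
One way to verify the SSH trust ahead of time is to attempt a non-interactive login. The sketch below assumes OpenSSH; the function name check_ssh is hypothetical. BatchMode=yes makes ssh fail instead of prompting for a password, which is how a missing key shows up.

```shell
# Return 0 if passwordless SSH to the given host works, non-zero otherwise.
# BatchMode=yes disables password prompts, so a missing key is a hard failure.
check_ssh() {
  host=$1
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
    echo "SSH to $host: ok"
  else
    echo "SSH to $host: FAILED (set up key-based login first)" >&2
    return 1
  fi
}
```

The check would be run as the database OS user in both directions, for example check_ssh x.x.80.127 on x.x.80.126 and the reverse on the other node.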

2. While the switchover is running, repmgrd should not be running with failover=automatic set in repmgr.conf; otherwise the repmgrd daemon may try to promote a standby on its own.
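
The setting can be inspected before the switchover with a small helper. This is only a sketch: the function name failover_mode is hypothetical, the path to repmgr.conf depends on the installation, and the parsing assumes a plain failover=value or failover='value' line.

```shell
# Print the value of the last uncommented "failover" line in a repmgr.conf file.
# Usage: failover_mode /path/to/repmgr.conf
failover_mode() {
  sed -n "s/^[[:space:]]*failover[[:space:]]*=[[:space:]]*'\{0,1\}\([A-Za-z]*\)'\{0,1\}.*/\1/p" "$1" | tail -n 1
}
```

If the helper prints automatic on a node where repmgrd is running, repmgrd should be stopped or the setting changed before the switchover starts.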

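3. The database environment variables should be set in .bashrc; otherwise the command fails because repmgr cannot be found on the remote node: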

ERROR: unable to execute "repmgr" on "x.x.80.127"
HINT: check "pg_bindir" is set to the correct path in "repmgr.conf"; current value: "/opt/HighGo5.6.5-cluster/bin"
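
The likely reason is that commands started over SSH run in a non-interactive, non-login shell, for which bash typically reads ~/.bashrc but not ~/.bash_profile. A minimal sketch of the relevant line follows, using the bin directory from the HINT above. It assumes bash is the login shell of the database OS user, and any other variables the installation needs would be added the same way.

```shell
# ~/.bashrc of the database OS user, on every node.
# Put this near the top: some distributions' .bashrc returns early
# for non-interactive shells, which would skip anything placed below.
export PATH=/opt/HighGo5.6.5-cluster/bin:$PATH
```

To confirm the result, running "command -v repmgr" over SSH from the other node should print a path under that directory.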

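4. It is recommended to run the command with --dry-run first. This checks whether the switchover would succeed without actually performing it; run the real switchover once the dry run passes.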
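
The two-stage approach can be wrapped in a small function so that the real switchover only runs after a passing dry run. This is a sketch under the assumption that the dry run exits with a non-zero status when a prerequisite check fails; the function name safe_switchover is hypothetical.

```shell
# Run the switchover only if the dry run succeeds.
# Must be executed on the standby that is to be promoted.
safe_switchover() {
  if repmgr standby switchover --dry-run; then
    repmgr standby switchover
  else
    echo "dry run failed; switchover not attempted" >&2
    return 1
  fi
}
```

Reviewing the dry-run output by hand before proceeding is still worthwhile, since warnings do not necessarily cause a failure status.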