1.查看复制槽
repmgr=# SELECT node_id, upstream_node_id, active, node_name, type, priority, slot_name
repmgr-# FROM repmgr.nodes ORDER BY node_id;
node_id | upstream_node_id | active | node_name | type | priority | slot_name
---------+------------------+--------+-----------+---------+----------+---------------
1 | | t | pg1 | primary | 100 | repmgr_slot_1
2 | 1 | t | pg2 | standby | 100 | repmgr_slot_2
3 | 1 | t | pg3 | standby | 100 | repmgr_slot_3
4 | 1 | t | pg4 | witness | 0 | repmgr_slot_4
(4 rows)
repmgr=# SELECT slot_name, slot_type, active, active_pid FROM pg_replication_slots ;
slot_name | slot_type | active | active_pid
---------------+-----------+--------+------------
repmgr_slot_2 | physical | t | 5891
repmgr_slot_3 | physical | t | 6371
(2 rows)
2.将第一个从库切换为主库
(1)切换前检查状态
[pgsql@pg2:/home/pgsql]$repmgr -f /postgresql/app/postgresql/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+------+---------+-----------+----------+----------+----------+----------+-------------------------------------------------------------------------
1 | pg1 | primary | * running | | default | 100 | 1 | host=192.168.1.10 port=5432 user=repmgr dbname=repmgr connect_timeout=2
2 | pg2 | standby | running | pg1 | default | 100 | 1 | host=192.168.1.11 port=5432 user=repmgr dbname=repmgr connect_timeout=2
3 | pg3 | standby | running | pg1 | default | 100 | 1 | host=192.168.1.12 port=5432 user=repmgr dbname=repmgr connect_timeout=2
4 | pg4 | witness | * running | pg1 | default | 0 | n/a | host=192.168.1.13 port=5432 user=repmgr dbname=repmgr connect_timeout=2
(2)运行切换预检查
[pgsql@pg2:/home/pgsql]$repmgr -f /postgresql/app/postgresql/repmgr.conf standby switchover --siblings-follow --dry-run --force-rewindNOTICE: checking switchover on node "pg2" (ID: 2) in --dry-run mode
INFO: prerequisites for using pg_rewind are met
WARNING: unable to connect to remote host "192.168.1.10" via SSH
ERROR: unable to connect via SSH to host "192.168.1.10", user ""
无法切换成功。配置ssh;
需要重新在2号节点配置SSH免密登陆。
[pgsql@pg2:/home/pgsql]$repmgr -f /postgresql/app/postgresql/repmgr.conf standby switchover --siblings-follow --dry-run --force-rewindNOTICE: checking switchover on node "pg2" (ID: 2) in --dry-run mode
INFO: prerequisites for using pg_rewind are met
INFO: SSH connection to host "192.168.1.10" succeeded
INFO: able to execute "repmgr" on remote host "192.168.1.10"
INFO: all sibling nodes are reachable via SSH
INFO: 3 walsenders required, 10 available
INFO: demotion candidate is able to make replication connection to promotion candidate
WARNING: number of pending archive files on demotion candidate "pg1" exceeds the warning threshold
DETAIL: 23 pending archive files (warning threshold: 16)
HINT: PostgreSQL will not shut down until all files are archived
INFO: replication lag on this standby is 0 seconds
INFO: 3 replication slots required, 10 available
NOTICE: attempting to pause repmgrd on 4 nodes
NOTICE: local node "pg2" (ID: 2) would be promoted to primary; current primary "pg1" (ID: 1) would be demoted to standby
INFO: following shutdown command would be run on node "pg1":
"/postgresql/app/postgresql/bin/pg_ctl stop -w -D /postgresql/data"
INFO: parameter "shutdown_check_timeout" is set to 60 seconds
INFO: prerequisites for executing STANDBY SWITCHOVER are met
(3)执行正式的切换
[pgsql@pg2:/home/pgsql]$repmgr -f /postgresql/app/postgresql/repmgr.conf standby switchover --siblings-follow --force-rewind
NOTICE: executing switchover on node "pg2" (ID: 2)
WARNING: number of pending archive files on demotion candidate "pg1" exceeds the warning threshold
DETAIL: 24 pending archive files (warning threshold: 16)
HINT: PostgreSQL will not shut down until all files are archived
NOTICE: attempting to pause repmgrd on 4 nodes
NOTICE: local node "pg2" (ID: 2) will be promoted to primary; current primary "pg1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "pg1" (ID: 1)
NOTICE: issuing CHECKPOINT on node "pg1" (ID: 1)
DETAIL: executing server command "/postgresql/app/postgresql/bin/pg_ctl stop -w -D /postgresql/data"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/55000028
NOTICE: promoting standby to primary
DETAIL: promoting server "pg2" (ID: 2) using "/postgresql/app/postgresql/bin/pg_ctl -w -D /postgresql/data promote"
waiting for server to promote.... done
server promoted
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "pg2" (ID: 2) was successfully promoted to primary
NOTICE: issuing CHECKPOINT on node "pg2" (ID: 2)
ERROR: unable to execute CHECKPOINT
NOTICE: node "pg2" (ID: 2) promoted to primary, node "pg1" (ID: 1) demoted to standby
NOTICE: executing STANDBY FOLLOW on 2 of 2 siblings
INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
NOTICE: switchover was successful
DETAIL: node "pg2" is now primary and node "pg1" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully
(4)检查主从
[pgsql@pg2:/home/pgsql]$repmgr -f /postgresql/app/postgresql/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+------+---------+-----------+----------+----------+----------+----------+-------------------------------------------------------------------------
1 | pg1 | standby | running | pg2 | default | 100 | 1 | host=192.168.1.10 port=5432 user=repmgr dbname=repmgr connect_timeout=2
2 | pg2 | primary | * running | | default | 100 | 2 | host=192.168.1.11 port=5432 user=repmgr dbname=repmgr connect_timeout=2
3 | pg3 | standby | running | pg2 | default | 100 | 1 | host=192.168.1.12 port=5432 user=repmgr dbname=repmgr connect_timeout=2
4 | pg4 | witness | * running | pg2 | default | 0 | n/a | host=192.168.1.13 port=5432 user=repmgr dbname=repmgr connect_timeout=2
现在从库已经变为主库。