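Preface
A Redis Sentinel master-slave switchover is usually performed in one of two ways:
1. Run the sentinel failover mastername command on a sentinel node.
2. Shut down or restart the Redis master instance so that Sentinel fails over on its own.
Environment: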
10.1.1.100 10079 master
10.1.1.100 10080 slave
10.1.1.100 10081 slave
10.1.1.100 20079 sentinel1
10.1.1.100 20080 sentinel2
10.1.1.100 20081 sentinel3
Switchover:
Move the master role from 10.1.1.100:10079 to one of the other Redis slave instances.
Redis Sentinel switchover, method 1:
Log in to any sentinel node.
Run the failover:
127.0.0.1:20079> sentinel failover mymaster
OK
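The same step can be scripted instead of typed into an interactive session. The following is a minimal sketch, not a hardened tool: the function names current_master and failover_and_wait and the variables MAX_TRIES and POLL_INTERVAL are made up for illustration, and the port and master name default to the values used in this article. It relies on the SENTINEL get-master-addr-by-name command to see when the master address changes.

```shell
#!/bin/sh
# Sketch: trigger a manual failover and wait until Sentinel reports a new master.
# Defaults come from this article's environment; override them as needed.
SENTINEL_PORT="${SENTINEL_PORT:-20079}"
MASTER_NAME="${MASTER_NAME:-mymaster}"

# Print the current master as "ip:port".
# In non-interactive mode redis-cli prints the ip and the port on separate lines,
# so paste joins them with a colon.
current_master() {
    redis-cli -p "$SENTINEL_PORT" sentinel get-master-addr-by-name "$MASTER_NAME" | paste -sd: -
}

# Ask Sentinel for a failover, then poll until the master address changes.
# Prints the new master address, or returns 1 if nothing changed in time.
failover_and_wait() {
    old=$(current_master)
    redis-cli -p "$SENTINEL_PORT" sentinel failover "$MASTER_NAME" >/dev/null || return 1
    tries=0
    while [ "$tries" -lt "${MAX_TRIES:-30}" ]; do
        new=$(current_master)
        if [ -n "$new" ] && [ "$new" != "$old" ]; then
            echo "$new"
            return 0
        fi
        tries=$((tries + 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    return 1
}

# Usage (assumes a sentinel is listening on SENTINEL_PORT on this host):
#   failover_and_wait
```

Polling the sentinel rather than the Redis instances keeps the script independent of which slave ends up being promoted.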
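Failover log on the former master, 10079:
The master role moved from 10.1.1.100:10079 to 10.1.1.100:10081.
At Sentinel's request, the former master ran SLAVE OF 10.1.1.100:10081. It attempted a partial resynchronization, but the new master answered with a full resync, so 10079 reloaded the whole data set from 10081.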
1620:M 01 Sep 14:29:19.902 # Connection with slave 10.1.1.100:10081 lost.
1620:M 01 Sep 14:29:20.846 # Connection with slave client id #9 lost.
1620:S 01 Sep 14:29:30.933 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1620:S 01 Sep 14:29:30.933 * SLAVE OF 10.1.1.100:10081 enabled (user request from 'id=33 addr=10.1.1.100:43954 fd=11 name=sentinel-e50fd8d2-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1620:S 01 Sep 14:29:30.934 # CONFIG REWRITE executed with success.
1620:S 01 Sep 14:29:31.696 * Connecting to MASTER 10.1.1.100:10081
1620:S 01 Sep 14:29:31.696 * MASTER <-> SLAVE sync started
1620:S 01 Sep 14:29:31.696 * Non blocking connect for SYNC fired the event.
1620:S 01 Sep 14:29:31.697 * Master replied to PING, replication can continue...
1620:S 01 Sep 14:29:31.697 * Trying a partial resynchronization (request 9d20ee7ee943a42ebcd9756c0d952d065d2c0250:2765734).
1620:S 01 Sep 14:29:31.698 * Full resync from master: 033c7d4ab3f1e604e1a95486c9ee80410f8412bd:2766349
1620:S 01 Sep 14:29:31.698 * Discarding previously cached master state.
1620:S 01 Sep 14:29:31.706 * MASTER <-> SLAVE sync: receiving 209 bytes from master
1620:S 01 Sep 14:29:31.707 * MASTER <-> SLAVE sync: Flushing old data
1620:S 01 Sep 14:29:31.707 * MASTER <-> SLAVE sync: Loading DB in memory
1620:S 01 Sep 14:29:31.707 * MASTER <-> SLAVE sync: Finished with success
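Log on 10081, the slave that was promoted to master:
It switched to the master role (MASTER MODE enabled) and then received synchronization requests from 10080 and 10079 in turn. Both partial resynchronization requests were rejected, so it ran a BGSAVE for each and sent the resulting RDB to that slave.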
1645:M 01 Sep 14:29:19.901 # Setting secondary replication ID to 9d20ee7ee943a42ebcd9756c0d952d065d2c0250, valid up to offset: 2763788. New replication ID is 033c7d4ab3f1e604e1a95486c9ee80410f8412bd
1645:M 01 Sep 14:29:19.901 # Connection with master lost.
1645:M 01 Sep 14:29:19.901 * Caching the disconnected master state.
1645:M 01 Sep 14:29:19.901 * Discarding previously cached master state.
1645:M 01 Sep 14:29:19.901 * MASTER MODE enabled (user request from 'id=4 addr=10.1.1.100:56744 fd=8 name=sentinel-900c1440-cmd age=13459 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1645:M 01 Sep 14:29:19.902 # CONFIG REWRITE executed with success.
1645:M 01 Sep 14:29:21.288 * Slave 10.1.1.100:10080 asks for synchronization
1645:M 01 Sep 14:29:21.288 * Partial resynchronization not accepted: Requested offset for second ID was 2763927, but I can reply up to 2763788
1645:M 01 Sep 14:29:21.288 * Starting BGSAVE for SYNC with target: disk
1645:M 01 Sep 14:29:21.288 * Background saving started by pid 3562
3562:C 01 Sep 14:29:21.292 * DB saved on disk
3562:C 01 Sep 14:29:21.292 * RDB: 0 MB of memory used by copy-on-write
1645:M 01 Sep 14:29:21.380 * Background saving terminated with success
1645:M 01 Sep 14:29:21.381 * Synchronization with slave 10.1.1.100:10080 succeeded
1645:M 01 Sep 14:29:31.697 * Slave 10.1.1.100:10079 asks for synchronization
1645:M 01 Sep 14:29:31.697 * Partial resynchronization not accepted: Requested offset for second ID was 2765734, but I can reply up to 2763788
1645:M 01 Sep 14:29:31.698 * Starting BGSAVE for SYNC with target: disk
1645:M 01 Sep 14:29:31.698 * Background saving started by pid 3578
3578:C 01 Sep 14:29:31.701 * DB saved on disk
3578:C 01 Sep 14:29:31.702 * RDB: 0 MB of memory used by copy-on-write
1645:M 01 Sep 14:29:31.706 * Background saving terminated with success
1645:M 01 Sep 14:29:31.706 * Synchronization with slave 10.1.1.100:10079 succeeded
Log on slave 10080:
1005:S 01 Sep 14:29:20.846 # Connection with master lost.
1005:S 01 Sep 14:29:20.846 * Caching the disconnected master state.
1005:S 01 Sep 14:29:20.846 * SLAVE OF 10.1.1.100:10081 enabled (user request from 'id=4 addr=10.1.1.100:36399 fd=8 name=sentinel-900c1440-cmd age=13460 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=139 qbuf-free=32629 obl=36 oll=0 omem=0 events=r cmd=exec')
1005:S 01 Sep 14:29:20.847 # CONFIG REWRITE executed with success.
1005:S 01 Sep 14:29:21.286 * Connecting to MASTER 10.1.1.100:10081
1005:S 01 Sep 14:29:21.286 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:29:21.286 * Non blocking connect for SYNC fired the event.
1005:S 01 Sep 14:29:21.287 * Master replied to PING, replication can continue...
1005:S 01 Sep 14:29:21.287 * Trying a partial resynchronization (request 9d20ee7ee943a42ebcd9756c0d952d065d2c0250:2763927).
1005:S 01 Sep 14:29:21.288 * Full resync from master: 033c7d4ab3f1e604e1a95486c9ee80410f8412bd:2764227
1005:S 01 Sep 14:29:21.288 * Discarding previously cached master state.
1005:S 01 Sep 14:29:21.381 * MASTER <-> SLAVE sync: receiving 209 bytes from master
1005:S 01 Sep 14:29:21.381 * MASTER <-> SLAVE sync: Flushing old data
1005:S 01 Sep 14:29:21.381 * MASTER <-> SLAVE sync: Loading DB in memory
1005:S 01 Sep 14:29:21.381 * MASTER <-> SLAVE sync: Finished with success
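When reading logs like the ones above, the key question is whether a slave got away with a partial resynchronization or had to do a full one. A small filter can answer that quickly. This is only a sketch: the function name resync_summary is made up, and the two patterns are copied from the log messages quoted in this article, so other Redis versions may word them differently.

```shell
#!/bin/sh
# Sketch: count full and partial resynchronizations in a slave's log.
# Reads the log from stdin and prints one summary line.
resync_summary() {
    awk '
        /Full resync from master/              { full++ }
        /Successful partial resynchronization/ { partial++ }
        END { printf "full=%d partial=%d\n", full, partial }
    '
}

# Usage (the log path is a placeholder, not taken from this article):
#   resync_summary < /path/to/redis.log
```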
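How can the master be switched to a specific node?
Adjust the slave priority (slave-priority) before the failover: set slave-priority to 0 on every slave that should not become master. Sentinel never promotes a slave whose priority is 0; among the remaining slaves, a lower non-zero value is preferred.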
redis@cjchdb-aaa-01:/redis/sentinel$cat 10079/redis.conf |grep slave-priority
slave-priority 100
redis@cjchdb-aaa-01:/redis/sentinel$cat 10080/redis.conf |grep slave-priority
slave-priority 100
redis@cjchdb-aaa-01:/redis/sentinel$cat 10081/redis.conf |grep slave-priority
slave-priority 100
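Check and change the priority
Set slave-priority on 10080 to 0. The current master is 10081, so 10079 is now the only slave eligible for promotion.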
127.0.0.1:10080> config get slave-priority
1) "slave-priority"
2) "100"
127.0.0.1:10080> config set slave-priority 0
OK
127.0.0.1:10080> config get slave-priority
1) "slave-priority"
2) "0"
The value in the configuration file is not updated automatically; CONFIG SET changes only the running instance:
redis@cjchdb-aaa-01:/redis/sentinel$cat 10080/redis.conf |grep slave-priority
slave-priority 100
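If the new priority should survive a restart, the running configuration can be written back to redis.conf with CONFIG REWRITE. The sketch below wraps both steps; the function name set_replica_priority is made up for illustration, and it assumes the instance was started with a configuration file that the Redis process can write to. On Redis 5 and later the parameter is called replica-priority, with slave-priority kept as an alias.

```shell
#!/bin/sh
# Sketch: change slave-priority on a running instance and persist it.
# Args: <port> <priority>
set_replica_priority() {
    port=$1
    priority=$2
    # CONFIG SET changes only the in-memory value; CONFIG REWRITE then writes
    # the running configuration back to the configuration file.
    # The second command runs only if the first redis-cli call exits successfully.
    redis-cli -p "$port" config set slave-priority "$priority" &&
        redis-cli -p "$port" config rewrite
}

# Usage:
#   set_replica_priority 10080 0
```

For a temporary change such as the one in this walkthrough, where the priority is restored afterwards, skipping the rewrite step is a reasonable choice.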
Run the failover
127.0.0.1:20081> sentinel failover mymaster
OK
The master was switched to 10079 successfully:
127.0.0.1:20081> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.1.1.100:10079,slaves=2,sentinels=3
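For scripted checks, the master address can be pulled out of the master0 line of this output. A minimal sketch, assuming the line format shown above; the function name master_addr_from_info is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract "ip:port" of a named master from `info sentinel` output on stdin.
# Args: <master name>
master_addr_from_info() {
    name=$1
    # Split each masterN line on "=" and ",", check the name, then print
    # the field that follows "address".
    awk -F'[=,]' -v name="$name" '
        /^master[0-9]+:/ && $2 == name {
            for (i = 1; i < NF; i++)
                if ($i == "address") { print $(i + 1); exit }
        }
    '
}

# Usage:
#   redis-cli -p 20081 info sentinel | master_addr_from_info mymaster
```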
Failover log on the former master, 10081:
1645:M 01 Sep 14:44:58.591 # Connection with slave 10.1.1.100:10079 lost.
1645:M 01 Sep 14:44:59.479 # Connection with slave client id #18 lost.
1645:S 01 Sep 14:45:09.556 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1645:S 01 Sep 14:45:09.556 * SLAVE OF 10.1.1.100:10079 enabled (user request from 'id=26 addr=10.1.1.100:52265 fd=10 name=sentinel-8b0a2ce1-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1645:S 01 Sep 14:45:09.556 # CONFIG REWRITE executed with success.
1645:S 01 Sep 14:45:09.792 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:45:09.792 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:45:09.792 * Non blocking connect for SYNC fired the event.
1645:S 01 Sep 14:45:09.792 * Master replied to PING, replication can continue...
1645:S 01 Sep 14:45:09.793 * Trying a partial resynchronization (request 033c7d4ab3f1e604e1a95486c9ee80410f8412bd:2959217).
1645:S 01 Sep 14:45:09.794 * Full resync from master: c8ed4fe97f766dfbb8cf7b5a11066e8fcf2bf844:2959670
1645:S 01 Sep 14:45:09.794 * Discarding previously cached master state.
1645:S 01 Sep 14:45:09.870 * MASTER <-> SLAVE sync: receiving 209 bytes from master
1645:S 01 Sep 14:45:09.871 * MASTER <-> SLAVE sync: Flushing old data
1645:S 01 Sep 14:45:09.871 * MASTER <-> SLAVE sync: Loading DB in memory
1645:S 01 Sep 14:45:09.871 * MASTER <-> SLAVE sync: Finished with success
Restore the original priority
127.0.0.1:10080> config set slave-priority 100
OK
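Redis Sentinel switchover, method 2:
Besides the sentinel failover command, a switchover can also be triggered by stopping or restarting the Redis master instance: once the sentinels detect that the master is down, they fail over automatically.
Shut down the master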
127.0.0.1:10079> shutdown
not connected>
Check the log of the former master:
1620:M 01 Sep 14:49:11.478 # User requested shutdown...
1620:M 01 Sep 14:49:11.478 * Saving the final RDB snapshot before exiting.
1620:M 01 Sep 14:49:11.487 * DB saved on disk
1620:M 01 Sep 14:49:11.487 * Removing the pid file.
1620:M 01 Sep 14:49:11.487 # Redis is now ready to exit, bye bye...
Check which node is the new master
127.0.0.1:20080> info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.1.1.100:10081,slaves=2,sentinels=3
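Check the log of the new master, 10081:
It tried repeatedly to reconnect to the former master 10079 and failed every time. Sentinel then promoted it (MASTER MODE enabled), and it served slave 10080 with a partial resynchronization from its backlog.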
1645:S 01 Sep 14:49:11.488 # Connection with master lost.
1645:S 01 Sep 14:49:11.488 * Caching the disconnected master state.
1645:S 01 Sep 14:49:12.212 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:12.212 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:12.212 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:13.214 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:13.214 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:13.215 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:14.217 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:14.217 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:14.217 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:15.218 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:15.218 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:15.219 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:16.221 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:16.221 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:16.222 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:17.224 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:17.224 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:17.224 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:18.226 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:18.226 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:18.226 # Error condition on socket for SYNC: Connection refused
1645:S 01 Sep 14:49:19.229 * Connecting to MASTER 10.1.1.100:10079
1645:S 01 Sep 14:49:19.230 * MASTER <-> SLAVE sync started
1645:S 01 Sep 14:49:19.230 # Error condition on socket for SYNC: Connection refused
1645:M 01 Sep 14:49:19.894 # Setting secondary replication ID to c8ed4fe97f766dfbb8cf7b5a11066e8fcf2bf844, valid up to offset: 3009514. New replication ID is fddd686f75809723cc89e296b3daeee549e066d7
1645:M 01 Sep 14:49:19.894 * Discarding previously cached master state.
1645:M 01 Sep 14:49:19.894 * MASTER MODE enabled (user request from 'id=30 addr=10.1.1.100:45093 fd=7 name=sentinel-900c1440-cmd age=250 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1645:M 01 Sep 14:49:19.895 # CONFIG REWRITE executed with success.
1645:M 01 Sep 14:49:21.435 * Slave 10.1.1.100:10080 asks for synchronization
1645:M 01 Sep 14:49:21.435 * Partial resynchronization request from 10.1.1.100:10080 accepted. Sending 440 bytes of backlog starting from offset 3009514.
Log on slave 10080:
1005:S 01 Sep 14:49:11.488 # Connection with master lost.
1005:S 01 Sep 14:49:11.488 * Caching the disconnected master state.
1005:S 01 Sep 14:49:12.413 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:12.414 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:12.414 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:13.416 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:13.416 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:13.417 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:14.417 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:14.417 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:14.417 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:15.420 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:15.420 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:15.420 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:16.422 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:16.422 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:16.423 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:17.425 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:17.425 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:17.425 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:18.426 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:18.426 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:18.426 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:19.429 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:19.429 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:19.429 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:20.432 * Connecting to MASTER 10.1.1.100:10079
1005:S 01 Sep 14:49:20.432 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:20.432 # Error condition on socket for SYNC: Connection refused
1005:S 01 Sep 14:49:20.791 * SLAVE OF 10.1.1.100:10081 enabled (user request from 'id=22 addr=10.1.1.100:59965 fd=7 name=sentinel-900c1440-cmd age=261 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=139 qbuf-free=32629 obl=36 oll=0 omem=0 events=r cmd=exec')
1005:S 01 Sep 14:49:20.791 # CONFIG REWRITE executed with success.
1005:S 01 Sep 14:49:21.434 * Connecting to MASTER 10.1.1.100:10081
1005:S 01 Sep 14:49:21.434 * MASTER <-> SLAVE sync started
1005:S 01 Sep 14:49:21.434 * Non blocking connect for SYNC fired the event.
1005:S 01 Sep 14:49:21.435 * Master replied to PING, replication can continue...
1005:S 01 Sep 14:49:21.435 * Trying a partial resynchronization (request c8ed4fe97f766dfbb8cf7b5a11066e8fcf2bf844:3009514).
1005:S 01 Sep 14:49:21.435 * Successful partial resynchronization with master.
1005:S 01 Sep 14:49:21.435 # Master replication ID changed to fddd686f75809723cc89e296b3daeee549e066d7
1005:S 01 Sep 14:49:21.436 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.
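The logs explain why method 1 ended in full resyncs while method 2 got a partial one. After promotion the new master keeps its old replication ID as a secondary ID that is valid only up to the offset at which it was promoted. A slave asking to continue from that ID is accepted only if its requested offset does not exceed that limit. The sketch below models just this one check; the function name psync_via_secondary_id is made up, and the real decision in Redis also involves the primary replication ID and the bounds of the replication backlog, which are left out here.

```shell
#!/bin/sh
# Sketch: simplified PSYNC decision against the master's secondary replication ID.
# Args: <requested_id> <requested_offset> <secondary_id> <valid_up_to_offset>
# Prints "partial" if the request could continue from the backlog, else "full".
psync_via_secondary_id() {
    if [ "$1" = "$3" ] && [ "$2" -le "$4" ]; then
        echo partial
    else
        echo full
    fi
}

# Method 1, slave 10080: requested offset 2763927, but the ID is valid up to 2763788.
#   psync_via_secondary_id 9d20ee7ee943a42ebcd9756c0d952d065d2c0250 2763927 \
#       9d20ee7ee943a42ebcd9756c0d952d065d2c0250 2763788
# Method 2, slave 10080: requested offset 3009514, and the ID is valid up to 3009514.
#   psync_via_secondary_id c8ed4fe97f766dfbb8cf7b5a11066e8fcf2bf844 3009514 \
#       c8ed4fe97f766dfbb8cf7b5a11066e8fcf2bf844 3009514
```

In method 1 the other instances had already replicated past the offset at which 10081 was promoted, so their requests fell outside the valid range. In method 2 the old master was already down, so 10080 and 10081 stopped at the same offset.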
Other commonly used sentinel commands:
1. Two ways to start a sentinel node
redis-sentinel redis-sentinel-20079.conf
or
redis-server redis-sentinel-20079.conf --sentinel
2. Show information about the masters monitored by Sentinel
sentinel masters
sentinel master mastername
Show slave information
sentinel slaves mastername
3. Reset the state of the master, then rediscover its slaves and the other sentinels
sentinel reset mastername
4. Force Sentinel to write its configuration to disk
Useful when the configuration file has been corrupted or lost, for example after a disk failure.
sentinel flushconfig
Feel free to follow my WeChat official account, "IT小Chen".