Redis High-Availability Deployment: Read This One

Reposted · Published 2019-07-25 18:00:00


License: CC 4.0 BY-SA

Original link: https://www.xrcloud.net/activity/free/

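This article describes in detail a high-availability setup for Redis on AWS EC2 that combines master-replica replication, Sentinel, and a floating VIP. Replication, Sentinel monitoring, and VIP failover together keep the Redis service stable and reliable.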


I. Background

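Our company relies heavily on Redis. For high-traffic services we run Codis clusters and Redis 3.0 clusters. The Redis 3.0 cluster has been in production for more than half a year, and the cluster itself has never had any problems. However, this service is overseas and the cluster runs on AWS EC2, where network jitter or problems with the EC2 instances themselves have triggered master-replica failovers; AWS support is currently following up on this. The cluster currently handles 500,000+ QPS and already provides high availability and horizontal scaling.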

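In practice, though, some smaller services do not need a cluster; a single instance is enough to meet their needs. So we have to find a way to make a single instance highly available. I have been reading the relevant documentation and running some tests recently.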

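Some people use Redis master-replica plus LVS with a floating VIP, others use Redis master-replica plus Sentinel with a floating VIP, and some even handle failover in application code; there are all kinds of approaches. Below I describe the master-replica plus Sentinel with floating VIP approach, which we plan to roll out widely in production.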

II. Environment

# redis
100.10.32.54:6400    master
100.10.32.55:6400    replica
100.10.32.250        VIP
# sentinel
100.10.32.54:26400   sentinel, local node
100.10.32.55:26400   sentinel, local node
100.10.32.57:26400   sentinel, arbiter node

III. Deployment

1. Install

yum -y install redis

2. Write the Redis configuration file (on 100.10.32.54 and 100.10.32.55)

vim /etc/redis_6400.conf
daemonize yes
pidfile "/var/run/redis_6400.pid"
port 6400
tcp-backlog 65535
bind 0.0.0.0
timeout 0
tcp-keepalive 0
loglevel notice
logfile "/var/log/redis/redis_6400.log"
maxmemory 8gb
maxmemory-policy allkeys-lru
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/data/redis/6400"
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128

3. Write the Sentinel configuration file

(on 100.10.32.54, 100.10.32.55, and 100.10.32.57)

vim /etc/redis-sentinel6400.conf
daemonize yes
port 26400
dir "/data/redis/redis_sentinels"
pidfile "/var/run/redis/sentinel6400.pid"
logfile "/data/redis/redis_sentinels/sentinel6400.log"
sentinel monitor master6400 100.10.32.54 6400 2
sentinel down-after-milliseconds master6400 6000
sentinel failover-timeout master6400 18000
# Omit the client-reconfig-script line on the arbiter node.
# During a failover, Sentinel calls this script to move the VIP to the new master.
sentinel client-reconfig-script master6400 /opt/notify_master6400.sh

For how Sentinel works and what its parameters mean, see: http://redisdoc.com/topic/sentinel.html
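As a rough illustration of the trailing 2 in the sentinel monitor line: that value is the quorum, the number of Sentinels that must agree the master is unreachable. With three Sentinels it also happens to equal the majority that has to authorize a failover. The helper below is hypothetical and only does the arithmetic.

```shell
#!/bin/bash
# majority: hypothetical helper, not part of Redis or the original setup.
# Prints the smallest number of Sentinels that forms a majority of the total.
majority() {
    local total="$1"
    echo $(( total / 2 + 1 ))
}

# With the three Sentinels in this setup (two local nodes plus the arbiter),
# this prints 2, which matches the quorum used in the configuration.
majority 3
```

This is also why the third, arbiter-only Sentinel on 100.10.32.57 matters: with only two Sentinels, losing one host would leave no majority to authorize a failover.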

4. Write the VIP failover script (on 100.10.32.54 and 100.10.32.55)

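vim /opt/notify_master6400.sh
#!/bin/bash
# Sixth argument passed by Sentinel: IP of the new master
MASTER_IP=$6
LOCAL_IP='100.10.32.54' # change to 100.10.32.55 on the replica
VIP='100.10.32.250'
NETMASK='24'
INTERFACE='eth0'
if [ "${MASTER_IP}" = "${LOCAL_IP}" ]; then
    # This host is the new master: bind the VIP and announce it via gratuitous ARP
    /sbin/ip addr add ${VIP}/${NETMASK} dev ${INTERFACE}
    /sbin/arping -q -c 3 -A ${VIP} -I ${INTERFACE}
    exit 0
else
    # This host is not the master: release the VIP
    /sbin/ip addr del ${VIP}/${NETMASK} dev ${INTERFACE}
    exit 0
fi
exit 1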

chmod +x /opt/notify_master6400.sh   # make the script executable

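A brief note on how this script works: during a failover, Sentinel calls the script with seven arguments: <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>. The sixth argument, to-ip, is the IP of the new master and corresponds to MASTER_IP in the script. The if test that follows is straightforward: if MASTER_IP equals LOCAL_IP, the script binds the VIP; otherwise it removes the VIP.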
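To make the argument handling concrete, here is a minimal sketch of just the decision step, separated from the ip and arping calls so it can be tried without touching the network. It assumes Sentinel's argument order <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>; vip_action is a hypothetical helper name, not part of the original script.

```shell
#!/bin/bash
# vip_action: hypothetical helper, not part of the original script.
# First argument: IP of the host running the script.
# Remaining arguments: the seven values Sentinel passes to client-reconfig-script.
# Prints "add" if this host is the new master, "del" otherwise.
vip_action() {
    local local_ip="$1"
    shift                 # drop local_ip so $1..$7 are Sentinel's arguments
    local to_ip="$6"      # sixth Sentinel argument: IP of the new master
    if [ "${to_ip}" = "${local_ip}" ]; then
        echo "add"
    else
        echo "del"
    fi
}
```

Calling it from inside the script as vip_action '100.10.32.55' "$@" would print add on the host that has just been promoted and del on every other host.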

5. Start the Redis service (on 100.10.32.54 and 100.10.32.55)

redis-server /etc/redis_6400.conf

6. Initialize replication (on 100.10.32.55)

redis-cli -p 6400 slaveof 100.10.32.54 6400

7. Bind the VIP to the master (on 100.10.32.54)

/sbin/ip addr add 100.10.32.250/24 dev eth0

8. Start the Sentinel service (on 100.10.32.54, 100.10.32.55, and 100.10.32.57)

redis-server /etc/redis-sentinel6400.conf --sentinel

At this point the whole high-availability setup is complete.

[root@localhost tmp]# redis-cli -h 100.10.32.54 -p 6400 info Replication
# Replication
role:master
connected_slaves:1
slave0:ip=100.10.32.55,port=6400,state=online,offset=72669,lag=1
master_repl_offset:72669
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:72668

[root@localhost tmp]# redis-cli -h 100.10.32.54 -p 26400 info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master6400,status=ok,address=100.10.32.54:6400,slaves=1,sentinels=3

[root@localhost tmp]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    inet 100.10.32.54/24 brd 100.10.32.255 scope global eth0
    inet 100.10.32.250/24 scope global secondary eth0
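The VIP check can also be scripted instead of read by eye. A minimal sketch, assuming the ip output format shown in this article; has_vip is a hypothetical helper name, and it only inspects text supplied on stdin.

```shell
#!/bin/bash
# has_vip: hypothetical helper, not part of the original setup.
# Reads `ip addr` style output on stdin and succeeds (exit status 0) if the
# given address appears as an "inet <addr>/<prefix>" entry.
has_vip() {
    local vip="$1"
    grep -qF "inet ${vip}/"
}
```

On a live host it could be used as ip -4 addr show dev eth0 | has_vip 100.10.32.250 && echo "VIP is bound here", run on both Redis hosts to confirm that only the current master holds the address.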

IV. Testing

1. Stop the master

redis-cli -h 100.10.32.54  -p 6400 shutdown

2. Check whether the replica has been promoted to master

[root@localhost tmp]# redis-cli -h 100.10.32.55 -p 6400 info Replication
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
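The promotion check can be automated by pulling the role field out of the info Replication output. A minimal sketch; redis_role is a hypothetical helper name, and it only parses text given on stdin.

```shell
#!/bin/bash
# redis_role: hypothetical helper, not part of the original setup.
# Reads `info Replication` output on stdin and prints the value of the
# role field (for example "master" or "slave").
redis_role() {
    # Strip carriage returns in case the output uses CRLF line endings,
    # then print whatever follows "role:".
    tr -d '\r' | sed -n 's/^role:\(.*\)$/\1/p'
}
```

For example, redis-cli -h 100.10.32.55 -p 6400 info Replication | redis_role should print master once the failover has finished.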

3. Check whether the VIP has moved to 100.10.32.55

[root@localhost tmp]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    inet 100.10.32.55/24 brd 100.10.32.255 scope global eth0
    inet 100.10.32.250/24 scope global secondary eth0

4. Check Sentinel's monitoring status

[root@localhost tmp]# redis-cli -p 26400 info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master6400,status=ok,address=100.10.32.55:6400,slaves=1,sentinels=3
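To confirm from a script which address Sentinel now reports for the master, the masterN line of the info Sentinel output can be parsed. A minimal sketch, assuming the line format shown in this article; sentinel_master_addr is a hypothetical helper name.

```shell
#!/bin/bash
# sentinel_master_addr: hypothetical helper, not part of the original setup.
# Reads `info Sentinel` output on stdin and prints the ip:port that Sentinel
# reports for the master with the given name. The name is placed directly
# into a sed pattern, so it should be a plain identifier such as master6400.
sentinel_master_addr() {
    local name="$1"
    tr -d '\r' | sed -n "s/^master[0-9]*:name=${name},status=[^,]*,address=\([^,]*\),.*/\1/p"
}
```

For example, redis-cli -p 26400 info Sentinel | sentinel_master_addr master6400 should print 100.10.32.55:6400 after the test failover. Sentinel's own SENTINEL get-master-addr-by-name master6400 command returns the same information directly.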

Source: https://blog.51cto.com/navyaijm/1745569
