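Today a production Redis 4.0 Sentinel setup ended up in an inconsistent state after a failover. The topology is two Redis instances (one master, one slave) and three Sentinels. By "inconsistent" I mean that after the master/slave switch, the information reported by Sentinel was wrong: the node that was actually the master was shown as a slave, the slave was still shown as a slave, and both Redis instances were in read-only state. Querying each Redis instance directly showed a normal state. The cause is still unknown. Could it be a Sentinel bug?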
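One way to narrow down this kind of mismatch is to compare what Sentinel reports with what each Redis instance says about itself. The following is a minimal Java sketch of that comparison only. It assumes you have already captured the `INFO replication` text from each node and the master address from `SENTINEL get-master-addr-by-name`; the class name `RoleCheck`, its methods, and the sample values in `main` are hypothetical and not taken from the incident.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares Sentinel's view with each node's own view.
final class RoleCheck {

    // Extracts the "role:" field from the text returned by INFO replication.
    static String parseRole(String infoReplication) {
        for (String line : infoReplication.split("\r?\n")) {
            if (line.startsWith("role:")) {
                return line.substring("role:".length()).trim();
            }
        }
        return "unknown";
    }

    // sentinelMaster: "host:port" that Sentinel reports as the master.
    // infoByNode: "host:port" -> raw INFO replication output of that node.
    static List<String> findMismatches(String sentinelMaster, Map<String, String> infoByNode) {
        List<String> problems = new ArrayList<>();
        int masters = 0;
        for (Map.Entry<String, String> e : infoByNode.entrySet()) {
            String role = parseRole(e.getValue());
            boolean sentinelSaysMaster = e.getKey().equals(sentinelMaster);
            if ("master".equals(role)) {
                masters++;
            }
            if (sentinelSaysMaster && !"master".equals(role)) {
                problems.add(e.getKey() + ": Sentinel says master, node reports " + role);
            } else if (!sentinelSaysMaster && "master".equals(role)) {
                problems.add(e.getKey() + ": node reports master, Sentinel disagrees");
            }
        }
        if (masters != 1) {
            problems.add("expected exactly 1 master, found " + masters);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up sample values; paste the real INFO replication output here.
        Map<String, String> info = new LinkedHashMap<>();
        info.put("10.10.192.88:6379", "# Replication\r\nrole:master\r\n");
        info.put("10.10.192.88:6378", "# Replication\r\nrole:slave\r\n");
        // Pretend Sentinel points at the wrong node.
        findMismatches("10.10.192.88:6378", info).forEach(System.out::println);
    }
}
```

An empty result means the two views agree; any line printed tells you which node Sentinel and Redis disagree about, which is the first thing worth recording before restarting anything.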
My own test environment for Redis Sentinel uses 2 Redis ports and 3 Sentinel ports.
Master configuration:
bind 10.10.192.88
port 6379
tcp-backlog 511
unixsocket "/tmp/redis.sock"
unixsocketperm 700
timeout 0
tcp-keepalive 300
daemonize yes
supervised no
pidfile "/var/run/redis_6379.pid"
loglevel notice
logfile "/var/log/6379.log"
syslog-enabled yes
# syslog-ident redis
# syslog-facility local0
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/data/redis-4.0.9"
#
# slaveof <masterip> <masterport>
# masterauth <master-password>
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
# repl-ping-slave-period 10
# repl-timeout 60
repl-disable-tcp-nodelay no
# repl-backlog-size 1mb
# repl-backlog-ttl 3600
slave-priority 100
# min-slaves-to-write 3
# min-slaves-max-lag 10
# slave-announce-ip 5.5.5.5
# slave-announce-port 1234
# requirepass foobared
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
repl-backlog-size 1mb
repl-backlog-ttl 3600
maxclients 10000
maxmemory 1953125kb
maxmemory-policy allkeys-lfu
# maxmemory-samples 5
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-server-del yes
slave-lazy-flush yes
appendonly yes
appendfilename "6379.aof"
# appendfsync always
appendfsync everysec
# appendfsync no
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble no
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
#hash-max-zipmap-entries 512
#hash-max-zipmap-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
set-max-intset-entries 512
# Generated by CONFIG REWRITE

The slave configuration is similar. Sentinel configuration:
bind 127.0.0.1 10.10.192.88
daemonize yes
# protected-mode no
port 26379
loglevel notice
logfile "/var/log/26379.log"
# sentinel announce-ip <ip>
# sentinel announce-port <port>
# sentinel announce-ip 1.2.3.4
dir "/tmp"
sentinel monitor mymaster 10.10.192.88 6379 2
sentinel down-after-milliseconds mymaster 26379
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1

Then start the Redis and Sentinel services. Query the Sentinel information:
10.10.192.88:26379> SENTINEL master mymaster
1) "name"
2) "mymaster"
3) "ip"
4) "10.10.192.88"
5) "port"
6) "6379"
7) "runid"
8) "58feac31464155a5ccb6cbbf3c8aa83b7fa53d0a"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "31"
19) "last-ping-reply"
20) "31"
21) "down-after-milliseconds"
22) "26379"
23) "info-refresh"
24) "8950"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "741477"
29) "config-epoch"
30) "1"
31) "num-slaves"
32) "1"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "60000"
39) "parallel-syncs"
40) "1"

10.10.192.88:26379> SENTINEL slaves mymaster
1) 1) "name"
2) "10.10.192.88:6378"
3) "ip"
4) "10.10.192.88"
5) "port"
6) "6378"
7) "runid"
8) "9b6b6bf743f247077b7187da9603f78dc56f1b01"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "871"
19) "last-ping-reply"
20) "871"
21) "down-after-milliseconds"
22) "26379"
23) "info-refresh"
24) "7569"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "771286"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "10.10.192.88"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "166151"

Connecting to Sentinel from Java:
package com.isesol.elasticsearch;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisSentinelPool;
import java.util.HashSet;
import java.util.Set;
public class test {
    public static void main(String[] args) {
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        String masterName = "mymaster";

        // Addresses of the three Sentinels, not of the Redis instances themselves.
        Set<String> sentinels = new HashSet<String>();
        sentinels.add("10.10.192.88:26379");
        sentinels.add("10.10.192.88:26380");
        sentinels.add("10.10.192.88:26381");

        // try-with-resources closes the pool and the connection even if a command throws.
        try (JedisSentinelPool jedisSentinelPool = new JedisSentinelPool(masterName, sentinels, poolConfig)) {
            // The pool asks the Sentinels which node is currently the master.
            HostAndPort currentHostMaster = jedisSentinelPool.getCurrentHostMaster();
            System.out.println(currentHostMaster.getHost() + "--" + currentHostMaster.getPort());

            try (Jedis resource = jedisSentinelPool.getResource()) {
                String value = resource.get("wangjialong");
                System.out.println(value);
            }
        }
    }
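}

I have seen people add keepalived so that clients do not need to connect through Sentinel. That works, but it introduces a new problem: keepalived split-brain. It rarely happens, but I still would not recommend this approach.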
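Using the failover scripts that Sentinel itself provides to manage a VIP is also possible. But after thinking it over: what is actually wrong with connecting through Sentinel? Why go out of your way to avoid Sentinel and build your own VIP?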
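For context, what a Sentinel-aware client does at connect time is fairly small: it asks the Sentinels in turn for the current master address and connects to whatever they answer. Below is a rough sketch of that lookup using only the JDK, to show there is not much to avoid. It is not how Jedis is implemented internally; the class name `SentinelLookup` and its methods are hypothetical, it assumes the Sentinels accept the plain-text (inline) command form and need no password, and it ignores the `+switch-master` notifications a real client would also follow.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.Reader;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of Sentinel-based master discovery without a client library.
final class SentinelLookup {

    // Parses the reply to SENTINEL get-master-addr-by-name, expected to be an
    // array of two bulk strings (host, port). Returns "host:port", or null for
    // any other reply (unknown master name, error, closed connection).
    static String parseMasterAddr(Reader in) throws IOException {
        BufferedReader r = new BufferedReader(in);
        String header = r.readLine();
        if (header == null || !header.equals("*2")) {
            return null;
        }
        r.readLine();               // "$<len>" line preceding the host
        String host = r.readLine();
        r.readLine();               // "$<len>" line preceding the port
        String port = r.readLine();
        return (host == null || port == null) ? null : host + ":" + port;
    }

    // Sends the command to one Sentinel and parses its answer.
    static String askSentinel(String host, int port, String masterName) throws IOException {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 1000);
            s.setSoTimeout(1000);
            OutputStream out = s.getOutputStream();
            String cmd = "SENTINEL get-master-addr-by-name " + masterName + "\r\n";
            out.write(cmd.getBytes(StandardCharsets.UTF_8));
            out.flush();
            return parseMasterAddr(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
        }
    }

    // Tries each Sentinel in order and returns the first usable answer, or null.
    static String findMaster(List<String> sentinels, String masterName) {
        for (String sentinel : sentinels) {
            String[] hp = sentinel.split(":");
            try {
                String addr = askSentinel(hp[0], Integer.parseInt(hp[1]), masterName);
                if (addr != null) {
                    return addr;
                }
            } catch (IOException e) {
                // This Sentinel is unreachable or timed out; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> sentinels = Arrays.asList(
                "10.10.192.88:26379", "10.10.192.88:26380", "10.10.192.88:26381");
        System.out.println(findMaster(sentinels, "mymaster"));
    }
}
```

Because the client only needs one reachable Sentinel to learn the master address, this path has no single point of failure of its own, which is the main thing a VIP would otherwise be brought in to provide.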