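(1) Every master in the cluster takes part in the election (voting) process. If more than half of the masters cannot communicate with a given master for longer than cluster-node-timeout, that master is considered down.
(2) When does the whole cluster become unavailable (cluster_state:fail)? While it is unavailable, every operation against the cluster fails with the error "(error) CLUSTERDOWN The cluster is down".
a: If any master in the cluster goes down and that master has no slave, the cluster enters the fail state, because the slots it owned are no longer served.
b: If more than half of the masters in the cluster go down, the cluster enters the fail state whether or not they have slaves.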
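The rules above can be written down as a small model. This is only a sketch of the rules as stated here, not Redis's actual implementation; the `Master` class and both function names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Master:
    """Hypothetical record for one master node (not a Redis structure)."""
    name: str
    alive: bool
    live_slaves: int  # slaves that could take over if this master dies


def master_is_down(failure_reports: int, total_masters: int) -> bool:
    """Rule (1): a master counts as down once more than half of all
    masters have failed to reach it within cluster-node-timeout."""
    return failure_reports > total_masters // 2


def cluster_state(masters: List[Master]) -> str:
    """Rule (2): return "fail" or "ok" for the whole cluster."""
    dead = [m for m in masters if not m.alive]
    # (b) more than half of the masters are down: fail regardless of slaves
    if len(dead) > len(masters) // 2:
        return "fail"
    # (a) a dead master without any slave leaves its slots unserved
    if any(m.live_slaves == 0 for m in dead):
        return "fail"
    return "ok"
```

With three masters, for example, one dead master that still has a slave keeps the cluster usable, while one dead master without a slave does not.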

redis@slave01:~$ ls
redis7003  redis7004  redis7005  redis7006
redis@slave01:~$ tree redis7006
redis7006
├── bin
│   ├── redis-benchmark
│   ├── redis-check-aof
│   ├── redis-check-dump
│   ├── redis-cli
│   ├── redis-sentinel
│   ├── redis-server
│   └── redis-trib.rb
└── redis.conf

redis@slave01:~$ cd redis7006/
redis@slave01:~/redis7006$ bin/redis-server redis.conf

./redis-trib.rb add-node IP:PORT IP:PORT

./redis-trib.rb add-node --slave IP:PORT IP:PORT

./redis-trib.rb add-node --slave --master-id ID IP:PORT IP:PORT
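The three `add-node` forms differ only in their options: without options the new node joins as a master, `--slave` joins it as a slave, and `--slave --master-id ID` attaches it to one specific master. In each form the first IP:PORT is the new node and the second is any node already in the cluster. A minimal sketch of a helper that assembles these command lines follows; the function name and its parameters are hypothetical, and only the flags shown above are used.

```python
from typing import List, Optional


def add_node_command(new_node: str, existing_node: str,
                     slave: bool = False,
                     master_id: Optional[str] = None) -> List[str]:
    """Build the argv for `redis-trib.rb add-node` (illustrative helper)."""
    if master_id is not None and not slave:
        raise ValueError("--master-id is only used together with --slave")
    cmd = ["./redis-trib.rb", "add-node"]
    if slave:
        cmd.append("--slave")
    if master_id is not None:
        cmd += ["--master-id", master_id]
    # new node first, then a node that already belongs to the cluster
    cmd += [new_node, existing_node]
    return cmd


print(" ".join(add_node_command("192.168.207.130:7006", "192.168.207.128:7002")))
```

The returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems.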

redis@master:~/redis7000$ bin/redis-trib.rb add-node 192.168.207.130:7006 192.168.207.128:7002
>>> Adding node 192.168.207.130:7006 to cluster 192.168.207.128:7002
Connecting to node 192.168.207.128:7002: OK
Connecting to node 192.168.207.130:7003: OK
Connecting to node 192.168.207.130:7004: OK
Connecting to node 192.168.207.128:7001: OK
Connecting to node 192.168.207.128:7000: OK
Connecting to node 192.168.207.130:7005: OK
>>> Performing Cluster Check (using node 192.168.207.128:7002)
S: 0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002
   slots: (0 slots) slave
   replicates 8490b5e84bf0359871ea3fa55b97bb9877be0512
M: 8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 9d076e70871fc7291485aba97b2623dc9fb3b3b0 192.168.207.130:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000
   slots: (0 slots) slave
   replicates 9d076e70871fc7291485aba97b2623dc9fb3b3b0
S: f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005
   slots: (0 slots) slave
   replicates 568669e03c61b3c4edc31643dcb47e85bc2c3e23
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Connecting to node 192.168.207.130:7006: OK
>>> Send CLUSTER MEET to node 192.168.207.130:7006 to make it join the cluster.
[OK] New node added correctly.
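The "[OK] All 16384 slots covered." line means that the masters' slot ranges together cover every slot from 0 to 16383. The arithmetic behind that check can be sketched as follows; this is not how redis-trib is implemented internally, and `uncovered_slots` is a made-up name.

```python
from typing import List, Tuple

TOTAL_SLOTS = 16384


def uncovered_slots(ranges: List[Tuple[int, int]]) -> List[int]:
    """Return every slot in 0..16383 that no (start, end) range covers."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))  # both ends are inclusive
    return [s for s in range(TOTAL_SLOTS) if s not in covered]


# Slot ranges of the three masters in the output above.
masters = [(0, 5460), (5461, 10922), (10923, 16383)]
missing = uncovered_slots(masters)
print("covered" if not missing else f"{len(missing)} slots uncovered")
```

Dropping the first range from the list would leave 5461 slots uncovered, which is the situation rule (2)a describes.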

192.168.207.130:7003> cluster nodes
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444270807484 0 connected
f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005 slave 568669e03c61b3c4edc31643dcb47e85bc2c3e23 0 1444270805452 6 connected
7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000 slave 9d076e70871fc7291485aba97b2623dc9fb3b3b0 0 1444270806468 7 connected
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 slave 8490b5e84bf0359871ea3fa55b97bb9877be0512 0 1444270804437 4 connected
568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001 master - 0 1444270804437 2 connected 10923-16383
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 myself,master - 0 0 4 connected 5461-10922
9d076e70871fc7291485aba97b2623dc9fb3b3b0 192.168.207.130:7004 master - 0 1444270804437 7 connected 0-5460
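Each line of `cluster nodes` output is a space-separated record: node ID, address, flags, master ID (or `-`), ping-sent, pong-received, config epoch, link state, then zero or more slot ranges. A minimal parsing sketch makes it easy to see that the newly added 7006 is a master that owns no slots yet. The function name and dictionary keys are assumptions for illustration, and only the fields that appear in the output above are handled.

```python
from typing import Dict, List


def parse_cluster_nodes(text: str) -> List[Dict]:
    """Parse `cluster nodes` output into one dict per node."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a node record
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),
            "master_id": None if fields[3] == "-" else fields[3],
            "slots": fields[8:],  # empty for slaves and slotless masters
        })
    return nodes


sample = """
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444270807484 0 connected
9d076e70871fc7291485aba97b2623dc9fb3b3b0 192.168.207.130:7004 master - 0 1444270804437 7 connected 0-5460
"""
for node in parse_cluster_nodes(sample):
    if "master" in node["flags"] and not node["slots"]:
        print(node["addr"], "is a master without slots")
```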

./redis-trib.rb reshard IP:PORT

redis@master:~/redis7000$ bin/redis-trib.rb reshard 192.168.207.128:7000
Connecting to node 192.168.207.128:7000: OK
Connecting to node 192.168.207.130:7004: OK
Connecting to node 192.168.207.130:7003: OK
Connecting to node 192.168.207.128:7002: OK
Connecting to node 192.168.207.128:7001: OK
Connecting to node 192.168.207.130:7006: OK
Connecting to node 192.168.207.130:7005: OK
>>> Performing Cluster Check (using node 192.168.207.128:7000)
S: 7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000
   slots: (0 slots) slave
   replicates 9d076e70871fc7291485aba97b2623dc9fb3b3b0
M: 9d076e70871fc7291485aba97b2623dc9fb3b3b0 192.168.207.130:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002
   slots: (0 slots) slave
   replicates 8490b5e84bf0359871ea3fa55b97bb9877be0512
M: 568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006
   slots: (0 slots) master
   0 additional replica(s)
S: f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005
   slots: (0 slots) slave
   replicates 568669e03c61b3c4edc31643dcb47e85bc2c3e23
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 2000    # enter 2000
What is the receiving node ID? b502b7efac704e92ac439933a42edf613f908fea    # the node that should receive the slots; here, the ID of 7006
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:9d076e70871fc7291485aba97b2623dc9fb3b3b0    # the node to take slots from; here, the ID of 7004
Source node #2:done    # after typing done and pressing Enter, the slots to be moved are listed
    Moving slot 0 from 9d076e70871fc7291485aba97b2623dc9fb3b3b0
    ......... (omitted) ........
    Moving slot 1995 from 9d076e70871fc7291485aba97b2623dc9fb3b3b0
    Moving slot 1996 from 9d076e70871fc7291485aba97b2623dc9fb3b3b0
    Moving slot 1997 from 9d076e70871fc7291485aba97b2623dc9fb3b3b0
    Moving slot 1998 from 9d076e70871fc7291485aba97b2623dc9fb3b3b0
    Moving slot 1999 from 9d076e70871fc7291485aba97b2623dc9fb3b3b0
Do you want to proceed with the proposed reshard plan (yes/no)? yes    # execute the plan; the slots are now actually moved
Moving slot 0 from 192.168.207.130:7004 to 192.168.207.130:7006:
..... (omitted) .....
Moving slot 1993 from 192.168.207.130:7004 to 192.168.207.130:7006:
Moving slot 1994 from 192.168.207.130:7004 to 192.168.207.130:7006:
Moving slot 1995 from 192.168.207.130:7004 to 192.168.207.130:7006:
Moving slot 1996 from 192.168.207.130:7004 to 192.168.207.130:7006:
Moving slot 1997 from 192.168.207.130:7004 to 192.168.207.130:7006:
Moving slot 1998 from 192.168.207.130:7004 to 192.168.207.130:7006:
Moving slot 1999 from 192.168.207.130:7004 to 192.168.207.130:7006:
redis@master:~/redis7000$

192.168.207.130:7003> cluster nodes
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444273978574 8 connected 0-1999    # now owns 2000 slots
f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005 slave 568669e03c61b3c4edc31643dcb47e85bc2c3e23 0 1444273977559 6 connected
7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000 slave 9d076e70871fc7291485aba97b2623dc9fb3b3b0 0 1444273975530 7 connected
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 slave 8490b5e84bf0359871ea3fa55b97bb9877be0512 0 1444273977559 4 connected
568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001 master - 0 1444273975530 2 connected 10923-16383
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 myself,master - 0 0 4 connected 5461-10922
9d076e70871fc7291485aba97b2623dc9fb3b3b0 192.168.207.130:7004 master - 0 1444273976544 7 connected 2000-5460    # 2000 slots fewer than before
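The before and after slot ranges follow from simple arithmetic: 7004 owned slots 0-5460, 2000 of them went to 7006, and the rest stayed. The sketch below models that split. It assumes the lowest-numbered slots are taken first, which matches the transcript for a single source node but is not a statement about redis-trib's general selection algorithm; `take_slots` is a hypothetical name.

```python
from typing import List, Tuple


def take_slots(source: List[int], count: int) -> Tuple[List[int], List[int]]:
    """Split a source node's slots into (moved, remaining).

    Assumption: lowest-numbered slots move first, as in the transcript.
    """
    if count > len(source):
        raise ValueError("source node does not own that many slots")
    ordered = sorted(source)
    return ordered[:count], ordered[count:]


moved, remaining = take_slots(list(range(0, 5461)), 2000)
print(moved[0], moved[-1])          # first and last slot handed to 7006
print(remaining[0], remaining[-1])  # range still owned by 7004
print(len(remaining))               # slots 7004 still has to give up before removal
```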

./redis-trib.rb del-node IP:PORT 'node-id'

redis@master:~/redis7000$ bin/redis-trib.rb del-node 192.168.207.128:7000 '9d076e70871fc7291485aba97b2623dc9fb3b3b0'
>>> Removing node 9d076e70871fc7291485aba97b2623dc9fb3b3b0 from cluster 192.168.207.128:7000
Connecting to node 192.168.207.128:7000: OK
Connecting to node 192.168.207.130:7004: OK
Connecting to node 192.168.207.130:7003: OK
Connecting to node 192.168.207.128:7002: OK
Connecting to node 192.168.207.128:7001: OK
Connecting to node 192.168.207.130:7006: OK
Connecting to node 192.168.207.130:7005: OK
[ERR] Node 192.168.207.130:7004 is not empty! Reshard data away and try again.

redis@master:~/redis7000$ bin/redis-trib.rb reshard --from 9d076e70871fc7291485aba97b2623dc9fb3b3b0 --to b502b7efac704e92ac439933a42edf613f908fea --slots 3461 --yes 192.168.207.128:7000
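The value passed to `--slots` is the number of slots 7004 still owns, 2000-5460, that is 5460 - 2000 + 1 = 3461. Because a master can only be deleted once it is empty, all of them have to move. A minimal sketch that derives the count from the range and assembles the non-interactive command follows; the helper name is hypothetical, and it uses only the flags that appear in the command above.

```python
from typing import List


def reshard_all_command(from_id: str, to_id: str, owned_range: str,
                        any_node: str) -> List[str]:
    """Build a `redis-trib.rb reshard` argv that empties one master.

    owned_range is a single inclusive range such as "2000-5460".
    """
    start, end = (int(part) for part in owned_range.split("-"))
    slots = end - start + 1  # inclusive on both ends
    return ["bin/redis-trib.rb", "reshard",
            "--from", from_id, "--to", to_id,
            "--slots", str(slots), "--yes", any_node]


cmd = reshard_all_command(
    "9d076e70871fc7291485aba97b2623dc9fb3b3b0",  # 7004, the node to empty
    "b502b7efac704e92ac439933a42edf613f908fea",  # 7006, the receiver
    "2000-5460",
    "192.168.207.128:7000",
)
print(" ".join(cmd))
```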

192.168.207.130:7004> cluster nodes
f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005 slave 568669e03c61b3c4edc31643dcb47e85bc2c3e23 0 1444283310797 6 connected
568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001 master - 0 1444283307772 2 connected 10923-16383
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 master - 0 1444283306763 4 connected 5461-10922
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 slave 8490b5e84bf0359871ea3fa55b97bb9877be0512 0 1444283308780 4 connected
9d076e70871fc7291485aba97b2623dc9fb3b3b0 192.168.207.130:7004 myself,master - 0 0 7 connected
7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000 slave 9d076e70871fc7291485aba97b2623dc9fb3b3b0 0 1444283308780 7 connected
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444283309788 8 connected 0-5460

redis@master:~/redis7000$ bin/redis-trib.rb del-node 192.168.207.128:7000 '9d076e70871fc7291485aba97b2623dc9fb3b3b0'
>>> Removing node 9d076e70871fc7291485aba97b2623dc9fb3b3b0 from cluster 192.168.207.128:7000
Connecting to node 192.168.207.128:7000: OK
Connecting to node 192.168.207.130:7004: OK
Connecting to node 192.168.207.130:7003: OK
Connecting to node 192.168.207.128:7002: OK
Connecting to node 192.168.207.128:7001: OK
Connecting to node 192.168.207.130:7006: OK
Connecting to node 192.168.207.130:7005: OK
>>> Sending CLUSTER FORGET messages to the cluster...
>>> 192.168.207.128:7000 as replica of 192.168.207.130:7006
>>> SHUTDOWN the node.

redis@slave01:~/redis7004$ bin/redis-cli -c -h 192.168.207.130 -p 7003
192.168.207.130:7003> cluster nodes
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444283726189 8 connected 0-5460
f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005 slave 568669e03c61b3c4edc31643dcb47e85bc2c3e23 0 1444283728207 6 connected
7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000 slave b502b7efac704e92ac439933a42edf613f908fea 0 1444283725182 8 connected    # note that its master is now 7006
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 slave 8490b5e84bf0359871ea3fa55b97bb9877be0512 0 1444283729216 4 connected
568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001 master - 0 1444283728207 2 connected 10923-16383
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 myself,master - 0 0 4 connected 5461-10922

redis@slave01:~/redis7004$ bin/redis-cli -c -h 192.168.207.128 -p 7002    # connect to the slave node 7002
192.168.207.128:7002> cluster nodes
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 master - 0 1444285238614 4 connected 5461-10922
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 myself,slave 8490b5e84bf0359871ea3fa55b97bb9877be0512 0 0 3 connected    # the master of 7002 is 7003
568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001 master - 0 1444285240634 2 connected 10923-16383
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444285237604 8 connected 0-5460
7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000 slave b502b7efac704e92ac439933a42edf613f908fea 0 1444285239625 8 connected
f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005 slave 568669e03c61b3c4edc31643dcb47e85bc2c3e23 0 1444285239625 6 connected

192.168.207.128:7002> cluster failover    # this is the command that switches master and slave
OK
192.168.207.128:7002> cluster nodes
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 slave 0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 0 1444285980634 9 connected    # now a slave
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 myself,master - 0 0 9 connected 5461-10922    # now a master
568669e03c61b3c4edc31643dcb47e85bc2c3e23 192.168.207.128:7001 master - 0 1444285977604 2 connected 10923-16383
b502b7efac704e92ac439933a42edf613f908fea 192.168.207.130:7006 master - 0 1444285976594 8 connected 0-5460
7dae923773125c5956605fac6159412850b100b3 192.168.207.128:7000 slave b502b7efac704e92ac439933a42edf613f908fea 0 1444285979624 8 connected
f3ba7c62307e0321f5d21310d14027c403c73907 192.168.207.130:7005 slave 568669e03c61b3c4edc31643dcb47e85bc2c3e23 0 1444285978615 6 connected
192.168.207.128:7002>
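`cluster failover` is issued on the slave (7002 here), and afterwards the two nodes have exchanged roles while the slot range 5461-10922 stays with the pair. One way to confirm the swap is to compare the flags column of `cluster nodes` before and after. A minimal sketch follows; the function names are hypothetical, and the sample lines are copied from the two outputs above.

```python
from typing import Dict, List


def roles(cluster_nodes_output: str) -> Dict[str, str]:
    """Map each node address to "master" or "slave" using the flags column."""
    result = {}
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        flags = fields[2].split(",")
        result[fields[1]] = "master" if "master" in flags else "slave"
    return result


def changed_roles(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Addresses whose role differs between the two snapshots."""
    return sorted(a for a in before if a in after and before[a] != after[a])


before = roles("""
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 master - 0 1444285238614 4 connected 5461-10922
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 myself,slave 8490b5e84bf0359871ea3fa55b97bb9877be0512 0 0 3 connected
""")
after = roles("""
8490b5e84bf0359871ea3fa55b97bb9877be0512 192.168.207.130:7003 slave 0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 0 1444285980634 9 connected
0becd52cb1fa1a69f8d3135dd98f87cd8ccf9d78 192.168.207.128:7002 myself,master - 0 0 9 connected 5461-10922
""")
print(changed_roles(before, after))
```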