问题背景说明
在学习redis的自动故障转移过程中,发现所有redis进程都部署在宿主机中时可以实现failover,但是当将各个实例部署到docker中时,发现启动之后conf文件中识别到的地址并非宿主机地址,导致无法正常通信.
今天简单阅读了一下redis文档的sentinel部分,发现有以下说明:
Sentinel, Docker, NAT, and possible issues
Docker uses a technique called port mapping: programs running inside Docker containers may be exposed with a different port compared to the one the program believes to be using. This is useful in order to run multiple containers using the same ports, at the same time, in the same server.
Docker is not the only software system where this happens, there are other Network Address Translation setups where ports may be remapped, and sometimes not ports but also IP addresses.
Remapping ports and addresses creates issues with Sentinel in two ways:
- Sentinel auto-discovery of other Sentinels no longer works, since it is based on hello messages where each Sentinel announce at which port and IP address they are listening for connection. However Sentinels have no way to understand that an address or port is remapped, so it is announcing an information that is not correct for other Sentinels to connect.
- Slaves are listed in the INFO output of a Redis master in a similar way: the address is detected by the master checking the remote peer of the TCP connection, while the port is advertised by the slave itself during the handshake, however the port may be wrong for the same reason as exposed in point 1.
Since Sentinels auto detect slaves using masters INFO output information, the detected slaves will not be reachable, and Sentinel will never be able to failover the master, since there are no good slaves from the point of view of the system, so there is currently no way to monitor with Sentinel a set of master and slave instances deployed with Docker, unless you instruct Docker to map the port 1:1.
For the first problem, in case you want to run a set of Sentinel instances using Docker with forwarded ports (or any other NAT setup