Redis源码解析:23sentinel(四)故障转移流程

Redis哨兵源码解析:故障转移流程详解
本文详细介绍了Redis Sentinel的故障转移流程,包括六个主要状态:WAIT_START, SELECT_SLAVE, SEND_SLAVEOF_NOONE, WAIT_PROMOTION, RECONF_SLAVES, UPDATE_CONFIG。在每个状态中,解释了其功能、处理函数及状态转换的条件,阐述了哨兵如何选择新主节点、重新配置从节点等关键步骤,展示了哨兵在故障恢复过程中的决策逻辑。" 112813592,10544501,Android ZIP解压缩与进度跟踪,"['Android开发', 'ZIP', '文件处理']

十:故障转移流程中的状态转换

         当哨兵针对某个主节点进行故障转移时,该主节点的故障转移状态master->failover_state,要依次经历下面六个状态:

SENTINEL_FAILOVER_STATE_WAIT_START

SENTINEL_FAILOVER_STATE_SELECT_SLAVE

SENTINEL_FAILOVER_STATE_SEND_SLAVEOF_NOONE

SENTINEL_FAILOVER_STATE_WAIT_PROMOTION

SENTINEL_FAILOVER_STATE_RECONF_SLAVES

SENTINEL_FAILOVER_STATE_UPDATE_CONFIG

 

         在哨兵的“主函数”sentinelHandleRedisInstance中,通过sentinelFailoverStateMachine函数进行故障转移状态的转换。它的代码如下:

void sentinelFailoverStateMachine(sentinelRedisInstance *ri) {
    redisAssert(ri->flags & SRI_MASTER);

    if (!(ri->flags & SRI_FAILOVER_IN_PROGRESS)) return;

    switch(ri->failover_state) {
        case SENTINEL_FAILOVER_STATE_WAIT_START:
            sentinelFailoverWaitStart(ri);
            break;
        case SENTINEL_FAILOVER_STATE_SELECT_SLAVE:
            sentinelFailoverSelectSlave(ri);
            break;
        case SENTINEL_FAILOVER_STATE_SEND_SLAVEOF_NOONE:
            sentinelFailoverSendSlaveOfNoOne(ri);
            break;
        case SENTINEL_FAILOVER_STATE_WAIT_PROMOTION:
            sentinelFailoverWaitPromotion(ri);
            break;
        case SENTINEL_FAILOVER_STATE_RECONF_SLAVES:
            sentinelFailoverReconfNextSlave(ri);
            break;
    }
}

        

         下面分别讲解每个状态及其处理函数。

 

1:SENTINEL_FAILOVER_STATE_WAIT_START

         上一章讲过,在哨兵的“主函数”sentinelHandleRedisInstance中,调用sentinelStartFailoverIfNeeded函数判断是否可以开始一次故障转移流程。当条件满足后,就会调用sentinelStartFailover函数,开始新一轮的故障转移流程。在该函数中,就会将该主节点的故障转移状态置为SENTINEL_FAILOVER_STATE_WAIT_START。

         一旦哨兵开始一次故障转移流程时,该哨兵第一件事就是向其他所有哨兵发送”is-master-down-by-addr”命令进行拉票。然后就是调用sentinelFailoverWaitStart函数处理当前状态。

         sentinelFailoverWaitStart函数的代码如下:

void sentinelFailoverWaitStart(sentinelRedisInstance *ri) {
    char *leader;
    int isleader;

    /* Check if we are the leader for the failover epoch. */
    leader = sentinelGetLeader(ri, ri->failover_epoch);
    isleader = leader && strcasecmp(leader,server.runid) == 0;
    sdsfree(leader);

    /* If I'm not the leader, and it is not a forced failover via
     * SENTINEL FAILOVER, then I can't continue with the failover. */
    if (!isleader && !(ri->flags & SRI_FORCE_FAILOVER)) {
        int election_timeout = SENTINEL_ELECTION_TIMEOUT;

        /* The election timeout is the MIN between SENTINEL_ELECTION_TIMEOUT
         * and the configured failover timeout. */
        if (election_timeout > ri->failover_timeout)
            election_timeout = ri->failover_timeout;
        /* Abort the failover if I'm not the leader after some time. */
        if (mstime() - ri->failover_start_time > election_timeout) {
            sentinelEvent(REDIS_WARNING,"-failover-abort-not-elected",ri,"%@");
            sentinelAbortFailover(ri);
        }
        return;
    }
    sentinelEvent(REDIS_WARNING,"+elected-leader",ri,"%@");
    ri->failover_state = SENTINEL_FAILOVER_STATE_SELECT_SLAVE;
    ri->failover_state_change_time = mstime();
    sentinelEvent(REDIS_WARNING,"+failover-state-select-slave",ri,"%@");
}

         当前哨兵,在调用sentinelStartFailover函数发起故障转移流程时,会将当前选举纪元sentinel.current_epoch记录到ri->failover_epoch中。因此,本函数首先根据ri->failover_epoch,调用函数sentinelGetLeader得到本界选举的结果leader。如果本界选举尚无人获得超过半数的选票,则leader为NULL;

         如果当前哨兵还没有赢得选举,并且主节点标志位中没有设置SRI_FORCE_FAILOVER标记,说明当前哨兵还没有获得足够的选票,暂时不能继续进行接下来的故障转移流程,需要直接返回。

         但是如果超过一定时间之后,当前哨兵还是没有赢得选举,则会终止当前的故障转移流程,因此如果当前距离开始故障转移的时间超过election_timeout,则调用函数sentinelAbortFailover,终止本次故障转移流程。

         如果当前哨兵最终赢得了选举,则更新故障转移的状态,置ri->failover_state属性为下一个状态:SENTINEL_FAILOVER_STATE_SELECT_SLAVE,并更新ri->failover_state_change为当前时间;

 

2:SENTINEL_FAILOVER_STATE_SELECT_SLAVE

         当故障转移状态转换为SENTINEL_FAILOVER_STATE_SELECT_SLAVE时,就需要在下线主节点的所有下属从节点中,按照一定的规则,选择一个从节点使其成为新的主节点。

         该状态下的处理函数为sentinelFailoverSelectSlave,该函数的代码如下:

void sentinelFailoverSelectSlave(sentinelRedisInstance *ri) {
    sentinelRedisInstance *slave = sentinelSelectSlave(ri);

    /* We don't handle the timeout in this state as the function aborts
     * the failover or go forward in the next state. */
    if (slave == NULL) {
        sentinelEvent(REDIS_WARNING,"-failover-abort-no-good-slave",ri,"%@");
        sentinelAbortFailover(ri);
    } else {
        sentinelEvent(REDIS_WARNING,"+selected-slave",slave,"%@");
        slave->flags |= SRI_PROMOTED;
        ri->promoted_slave = slave;
        ri->failover_state = SENTINEL_FAILOVER_STATE_SEND_SLAVEOF_NOONE;
        ri->failover_state_change_time = mstime();
        sentinelEvent(REDIS_NOTICE,"+failover-state-send-slaveof-noone",
            slave, "%@");
    }
}

         该函数首先调用函数sentinelSelectSlave选择一个符合条件的从节点;

         如果没有合适的从节点,则调用sentinelAbortFailover直接终止本次故障转移流程;

         如果找到了合适的从节点slave,则首先将标记SRI_PROMOTED增加到该从节点的标志位中;并使主节点实例的ri->promoted_slave指针指向该从节点实例,并将故障转移状态置为SENTINEL_FAILOVER_STATE_SEND_SLAVEOF_NOONE;然后更新ri->failover_state_change_time为当前时间;

 

         函数sentinelSelectSlave用于在下线主节点的所有从节点实例中,按照一定的规则选择一个从节点。该函数的代码如下:

sentinelRedisInstance *sentinelSelectSlave(sentinelRedisInstance *master) {
    sentinelRedisInstance **instance =
        zmalloc(sizeof(instance[0])*dictSize(master->slaves));
    sentinelRedisInstance *selected = NULL;
    int instances = 0;
    dictIterator *di;
    dictEntry *de;
    mstime_t max_master_down_time = 0;

    if (master->flags & SRI_S_DOWN)
        max_master_down_time += mstime() - master->s_down_since_time;
    max_master_down_time += master->down_after_period * 10;

    di = dictGetIterator(master->slaves);
    while((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *slave = dictGetVal(de);
        mstime_t info_validity_time;

        if (slave->flags & (SRI_S_DOWN|SRI_O_DOWN|SRI_DISCONNECTED)) continue;
        if (mstime() - slave->last_avail_time > SENTINEL_PING_PERIOD*5) continue;
        if (slave->slave_prior
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值