Kafka Source Code Study: Partition State

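This article looks closely at Kafka's PartitionStateMachine and how it maintains partition state: its member variables, its startup after controller election, the initialization of state-change listeners, and the transitions between partition states, such as NewPartition to OnlinePartition and OfflinePartition to OnlinePartition. It shows how Kafka handles leader election and ISR updates for partitions to keep the cluster running stably.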


Kafka maintains partition state through PartitionStateMachine. The rest of this article walks through the source code of that class:

 

I   Member variables of the class, and startup():

private val controllerContext = controller.controllerContext
private val controllerId = controller.config.brokerId
private val zkUtils = controllerContext.zkUtils
private val partitionState: mutable.Map[TopicAndPartition, PartitionState] = mutable.Map.empty
private val brokerRequestBatch = new ControllerBrokerRequestBatch(controller)
private val noOpPartitionLeaderSelector = new NoOpLeaderSelector(controllerContext)

private val stateChangeLogger = KafkaController.stateChangeLogger

this.logIdent = "[Partition state machine on Controller " + controllerId + "]: "

def startup() {
  initializePartitionState()
  triggerOnlinePartitionStateChange()

  info("Started partition state machine with initial state -> " + partitionState.toString())
}

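startup() is invoked after a successful controller election. It first initializes the state of every partition and then triggers the transition to OnlinePartition for the partitions that need it. Registration of the state-change listener does not appear in the body shown above.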

1  initializePartitionState() initializes the partition states and stores them in a Map named partitionState.

2    triggerOnlinePartitionStateChange() performs the actual state transitions. Internally it calls handleStateChange(), which pattern-matches the target state against Kafka's partition states such as NewPartition and OnlinePartition.
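The two startup steps above can be illustrated with a small, self-contained sketch. It is not the Kafka implementation: the types below are simplified stand-ins, and `initialStates` and `needOnlineTransition` are hypothetical helper names. The classification rule (no leader recorded means NewPartition, a live leader means OnlinePartition, a dead leader means OfflinePartition) reflects my reading of the older ZooKeeper-based controller and should be checked against the Kafka version you are studying.

```scala
// Simplified stand-ins for the Kafka types (assumption: not the real classes).
sealed trait PartitionState
case object NewPartition extends PartitionState
case object OnlinePartition extends PartitionState
case object OfflinePartition extends PartitionState
case object NonExistentPartition extends PartitionState

final case class TopicAndPartition(topic: String, partition: Int)

object InitialStateSketch {
  // Hypothetical helper: derive the initial state of every known partition.
  //   partitions  - all partitions known to the controller
  //   leaders     - leader broker id per partition, present only if one was recorded
  //   liveBrokers - ids of the brokers that are currently alive
  def initialStates(partitions: Set[TopicAndPartition],
                    leaders: Map[TopicAndPartition, Int],
                    liveBrokers: Set[Int]): Map[TopicAndPartition, PartitionState] =
    partitions.map { tp =>
      val state: PartitionState = leaders.get(tp) match {
        case Some(leader) if liveBrokers.contains(leader) => OnlinePartition
        case Some(_)                                      => OfflinePartition
        case None                                         => NewPartition
      }
      tp -> state
    }.toMap

  // Hypothetical helper: partitions that a startup pass would try to bring online.
  def needOnlineTransition(states: Map[TopicAndPartition, PartitionState]): Set[TopicAndPartition] =
    states.collect { case (tp, NewPartition | OfflinePartition) => tp }.toSet
}
```

In this sketch, only partitions classified as NewPartition or OfflinePartition are handed on for the OnlinePartition transition; partitions that already have a live leader are left alone.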

Kafka partitions can go through the following state transitions:

1   NonExistentPartition -> NewPartition:

     Loads the assigned replicas from ZooKeeper into the controller cache.

2   NewPartition -> OnlinePartition

    Assigns the first live replica as the leader and all live replicas as the ISR.

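3   OnlinePartition,OfflinePartition -> OnlinePartition

   Re-elects the leader and selects a new ISR.

   OnlinePartition -> OnlinePartition happens in two cases. When leaders are unevenly distributed across the cluster, so that load differs widely between brokers, a leader re-election is needed and the first replica in the AR list is preferred as leader. When the current leader is about to go offline, a re-election must be triggered so that the partition stays online, and the first replica in the ISR other than the outgoing leader is chosen as leader.

   OfflinePartition -> OnlinePartition happens when all replicas of a partition have gone offline and some of them come back online. Electing the leader and the ISR then has to take into account the relationship between the original ISR, the AR, and the current leader broker: the first live broker in the ISR is preferred, and failing that the first live broker in the AR.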

4  NewPartition,OnlinePartition,OfflinePartition -> OfflinePartition

   Only marks the new state.

5  OfflinePartition -> NonExistentPartition

   Only marks the new state.
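The five transitions listed above amount to a table of valid previous states for each target state. The sketch below encodes that table directly from the list; `TransitionSketch`, `validPreviousStates` and `isValidTransition` are hypothetical names chosen for illustration, and the state objects are simplified stand-ins rather than the Kafka classes.

```scala
// Simplified stand-ins for the Kafka partition states (assumption: not the real classes).
sealed trait PartitionState
case object NewPartition extends PartitionState
case object OnlinePartition extends PartitionState
case object OfflinePartition extends PartitionState
case object NonExistentPartition extends PartitionState

object TransitionSketch {
  // Valid previous states for each target state, taken from the five transitions listed above.
  def validPreviousStates(target: PartitionState): Set[PartitionState] = target match {
    case NewPartition         => Set(NonExistentPartition)
    case OnlinePartition      => Set(NewPartition, OnlinePartition, OfflinePartition)
    case OfflinePartition     => Set(NewPartition, OnlinePartition, OfflinePartition)
    case NonExistentPartition => Set(OfflinePartition)
  }

  // A transition is allowed only if the current state is a valid previous state of the target.
  def isValidTransition(from: PartitionState, to: PartitionState): Boolean =
    validPreviousStates(to).contains(from)
}
```

For example, `TransitionSketch.isValidTransition(NonExistentPartition, OnlinePartition)` is false in this model, because a partition has to pass through NewPartition before it can come online.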

 

 

private def handleStateChange(topic: String, partition: Int, targetState: PartitionState,
                              leaderSelector: PartitionLeaderSelector,
                              callbacks: Callbacks) {
  val topicAndPartition = TopicAndPartition(topic, partition)
  val currState = partitionState.getOrElseUpdate(topicAndPartition, NonExistentPartition)
  try {
    // check that the current state is a valid previous state for targetState
    assertValidTransition(topicAndPartition, targetState)
    targetState match {
      case NewPartition =>
        // switching to NewPartition is simple: just set the partition state to NewPartition
        partitionState.put(topicAndPartition, NewPartition)
        val assignedReplicas = controllerContext.partitionReplicaAssignment(topicAndPartition).mkString(",")
        stateChangeLogger.trace("Controller %d epoch %d changed partition %s state from %s to %s with assigned replicas %s"
                                  .format(controllerId, controller.epoch, topicAndPartition, currState, targetState,
                                          assignedReplicas))
        // post: partition has been assigned replicas
      case OnlinePartition =>
        partitionState(topicAndPartition) match {
          case NewPartition =>
            // initialize leader and isr path for new partition via initializeLeaderAndIsrForPartition.
            // No leader election or ISR selection happens on NewPartition -> OnlinePartition:
            // the first live broker in the AR becomes the leader, the live brokers become the ISR,
            // and the result is persisted to ZK
            initializeLeaderAndIsrForPartition(topicAndPartition)
          case OfflinePartition =>
            // moving from OfflinePartition to OnlinePartition elects a leader and selects the ISR
            electLeaderForPartition(topic, partition, leaderSelector)
          case OnlinePartition => // invoked when the leader needs to be re-elected
            electLeaderForPartition(topic, partition, leaderSelector)
          case _ => // should never come here since illegal previous states are checked above
        }
        partitionState.put(topicAndPartition, OnlinePartition)
        val leader = controllerContext.partitionLeadershipInfo(topicAndPartition).leaderAndIsr.leader
        stateChangeLogger.trace("Controller %d epoch %d changed partition %s from %s to %s with leader %d"
                                  .format(controllerId, controller.epoch, topicAndPartition, currState, targetState, leader))
         // post: partition has a leader
      case OfflinePartition =>
        // should be called when the leader for a partition is no longer alive
        stateChangeLogger.trace("Controller %d epoch %d changed partition %s state from %s to %s"
                                  .format(controllerId, controller.epoch, topicAndPartition, currState, targetState))
        partitionState.put(topicAndPartition, OfflinePartition)
        // post: partition has no alive leader
      case NonExistentPartition =>
        stateChangeLogger.trace("Controller %d epoch %d changed partition %s state from %s to %s"
                                  .format(controllerId, controller.epoch, topicAndPartition, currState, targetState))
        partitionState.put(topicAndPartition, NonExistentPartition)
        // post: partition state is deleted from all brokers and zookeeper
    }
  } catch {
    case t: Throwable =>
      stateChangeLogger.error("Controller %d epoch %d initiated state change for partition %s from %s to %s failed"
        .format(controllerId, controller.epoch, topicAndPartition, currState, targetState), t)
  }
}
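The OnlinePartition branch above relies on two selection rules described earlier: the default leader and ISR for a new partition, and the preference order when an offline partition comes back. A minimal sketch of both rules follows. It is a simplified model under stated assumptions, not the code of initializeLeaderAndIsrForPartition or of any PartitionLeaderSelector: `LeaderSelectionSketch`, `initialLeaderAndIsr` and `electFromOffline` are hypothetical names, and the ZooKeeper writes and broker requests are left out entirely.

```scala
object LeaderSelectionSketch {
  // Simplified stand-in for Kafka's leader/ISR record (assumption: no epochs or versions).
  final case class LeaderAndIsr(leader: Int, isr: List[Int])

  // NewPartition -> OnlinePartition: the first live assigned replica leads,
  // and all live assigned replicas form the ISR. None if no replica is alive.
  def initialLeaderAndIsr(assignedReplicas: List[Int], liveBrokers: Set[Int]): Option[LeaderAndIsr] = {
    val liveReplicas = assignedReplicas.filter(liveBrokers.contains)
    liveReplicas.headOption.map(leader => LeaderAndIsr(leader, liveReplicas))
  }

  // OfflinePartition -> OnlinePartition: prefer the first live broker in the ISR;
  // otherwise fall back to the first live broker in the AR, whose ISR is just itself.
  def electFromOffline(assignedReplicas: List[Int], isr: List[Int], liveBrokers: Set[Int]): Option[LeaderAndIsr] = {
    val liveIsr = isr.filter(liveBrokers.contains)
    liveIsr.headOption match {
      case Some(leader) => Some(LeaderAndIsr(leader, liveIsr))
      case None =>
        assignedReplicas.find(liveBrokers.contains).map(leader => LeaderAndIsr(leader, List(leader)))
    }
  }
}
```

The fallback to a replica outside the ISR can lose messages that the new leader never replicated. In Kafka this kind of unclean election is governed by the `unclean.leader.election.enable` setting, which the sketch does not model.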

 

 

 

 

 

 

 

 

 

 

 
