Kafka 2.2 Source Code Analysis: Log Storage

Overview

A Log is made up of a sequence of LogSegments, and each LogSegment has a base offset, which is the offset of the first message in that segment.

New LogSegments are created according to the Log's configuration policy, which controls both the maximum size of a segment in bytes and the time interval after which a new segment is rolled.
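To make the base-offset layout concrete, here is a minimal, self-contained sketch (not the actual Kafka code; SimpleSegment is a hypothetical stand-in for LogSegment) of how a map keyed by base offset, like the ConcurrentSkipListMap shown later in the class, locates the segment that contains a given offset:

import java.util.concurrent.ConcurrentSkipListMap

// Segments are keyed by their base offset, so the segment containing a given
// offset is the entry with the greatest key that is <= that offset.
case class SimpleSegment(baseOffset: Long)

object SegmentLookupSketch {
  private val segments = new ConcurrentSkipListMap[java.lang.Long, SimpleSegment]

  def addSegment(segment: SimpleSegment): Unit =
    segments.put(segment.baseOffset, segment)

  // Find the segment whose base offset is the largest one not greater than `offset`.
  def segmentFor(offset: Long): Option[SimpleSegment] =
    Option(segments.floorEntry(offset)).map(_.getValue)

  def main(args: Array[String]): Unit = {
    Seq(0L, 100L, 250L).foreach(base => addSegment(SimpleSegment(base)))
    println(segmentFor(180L)) // Some(SimpleSegment(100)): offset 180 lives in the segment starting at 100
  }
}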

 

Member variables

  • dir

The directory in which this Log's LogSegment files are created.

  • logStartOffset

The earliest offset that may be exposed to clients. logStartOffset can be advanced by:

  1. a user deleting records via a DeleteRecordsRequest
  2. log retention on the broker
  3. log truncation on the broker

logStartOffset is used in the following cases:

Log deletion: a LogSegment whose nextOffset does not exceed the Log's logStartOffset can be deleted. Deleting the active segment may in turn trigger a log roll.

Listing offsets: the Log's logStartOffset is returned in ListOffsetRequest responses. To avoid OffsetOutOfRangeException, logStartOffset must always be kept less than or equal to the Log's high watermark.
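A simplified sketch of that invariant follows; the class and method bodies here are illustrative only, not the real Log implementation. The point is that logStartOffset only ever moves forward and must never pass the high watermark:

// Sketch of the rule above (not the actual Kafka implementation):
// logStartOffset only moves forward, and must never pass the high watermark,
// otherwise a fetch at logStartOffset could fail with OffsetOutOfRangeException.
class LogStartOffsetSketch(@volatile var logStartOffset: Long,
                           @volatile var highWatermark: Long) {

  def maybeIncrementLogStartOffset(newLogStartOffset: Long): Unit = synchronized {
    require(newLogStartOffset <= highWatermark,
      s"Cannot advance logStartOffset to $newLogStartOffset past the high watermark $highWatermark")
    // Invoked for DeleteRecordsRequest, retention-driven deletion, and truncation;
    // this path only ever moves the offset forward.
    if (newLogStartOffset > logStartOffset)
      logStartOffset = newLogStartOffset
  }
}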

  • activeSegment

The newest segment among those managed by this Log (the "active" segment). A Log has exactly one active segment at a time; all of the other segments have already been persisted to disk.

  • logEndOffset

The offset of the next message to be appended, i.e. the offset immediately after the last record in the activeSegment.

@threadsafe
class Log(@volatile var dir: File,
          @volatile var config: LogConfig,
          @volatile var logStartOffset: Long,
          @volatile var recoveryPoint: Long,
          scheduler: Scheduler,
          brokerTopicStats: BrokerTopicStats,
          val time: Time,
          val maxProducerIdExpirationMs: Int,
          val producerIdExpirationCheckIntervalMs: Int,
          val topicPartition: TopicPartition,
          val producerStateManager: ProducerStateManager,
          logDirFailureChannel: LogDirFailureChannel) extends Logging with KafkaMetricsGroup {

  /* The earliest offset which is part of an incomplete transaction. This is used to compute the
   * last stable offset (LSO) in ReplicaManager. Note that it is possible that the "true" first unstable offset
   * gets removed from the log (through record or segment deletion). In this case, the first unstable offset
   * will point to the log start offset, which may actually be either part of a completed transaction or not
   * part of a transaction at all. However, since we only use the LSO for the purpose of restricting the
   * read_committed consumer to fetching decided data (i.e. committed, aborted, or non-transactional), this
   * temporary abuse seems justifiable and saves us from scanning the log after deletion to find the first offsets
   * of each ongoing transaction in order to compute a new first unstable offset. It is possible, however,
   * that this could result in disagreement between replicas depending on when they began replicating the log.
   * In the worst case, the LSO could be seen by a consumer to go backwards.
   */
  @volatile var firstUnstableOffset: Option[LogOffsetMetadata] = None

  /* Keep track of the current high watermark in order to ensure that segments containing offsets at or above it are
   * not eligible for deletion. This means that the active segment is only eligible for deletion if the high watermark
   * equals the log end offset (which may never happen for a partition under consistent load). This is needed to
   * prevent the log start offset (which is exposed in fetch responses) from getting ahead of the high watermark.
   */
  @volatile private var replicaHighWatermark: Option[Long] = None

  /* the actual segments of the log */
  private val segments: ConcurrentNavigableMap[java.lang.Long, LogSegment] = new ConcurrentSkipListMap[java.lang.Long, LogSegment]

  // Visible for testing
  @volatile var leaderEpochCache: Option[LeaderEpochFileCache] = None

  /**
   * The active segment that is currently taking appends
   */
  def activeSegment = segments.lastEntry.getValue

  /**
   * The offset metadata of the next message that will be appended to the log
   */
  def logEndOffsetMetadata: LogOffsetMetadata = nextOffsetMetadata

  /**
   * The offset of the next message that will be appended to the log
   */
  def logEndOffset: Long = nextOffsetMetadata.messageOffset

Appending to the log

Append records to the Log's active segment, rolling over to create a new segment when necessary.

This method assigns an offset to each record; if the assignOffsets parameter is false, however, it only validates that the existing offsets are sane.

The main steps of the method are as follows (a simplified sketch follows the list):

  • Validate the incoming messages, mainly checking record size and CRC;
  • Iterate over the records, assigning each one an offset that increases monotonically from the Log's current LEO, and perform further validation on each message;
  • Every message carries a timestamp; if the timestamp type is configured as log append time, logAppendTime is set to the current time;
  • Check whether the active segment is full; if it is, roll the log and create a new LogSegment;
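The sketch below compresses these steps into runnable Scala. Record, SketchSegment and AppendSketch are hypothetical stand-ins for Kafka's MemoryRecords, LogSegment and Log.append; it shows the ordering of the checks described above, not the real implementation.

// Simplified append flow with hypothetical Record / SketchSegment stand-ins.
case class Record(var offset: Long, var timestamp: Long, payload: Array[Byte], crcOk: Boolean)

class SketchSegment(val baseOffset: Long, val maxBytes: Int, val maxAgeMs: Long, val created: Long) {
  private var bytes = 0
  def append(r: Record): Unit = bytes += r.payload.length
  def shouldRoll(now: Long, incomingBytes: Int): Boolean =
    bytes + incomingBytes > maxBytes || now - created > maxAgeMs
}

class AppendSketch(var logEndOffset: Long, var active: SketchSegment, useLogAppendTime: Boolean) {

  def append(records: Seq[Record], assignOffsets: Boolean): Unit = {
    val now = System.currentTimeMillis()

    // 1. Validate the batch (the real code checks record size and CRC).
    require(records.nonEmpty && records.forall(_.crcOk), "CRC validation failed")

    // 2. Assign offsets increasing from the current LEO, or, when assignOffsets
    //    is false, only verify that the existing offsets increase monotonically.
    if (assignOffsets) {
      records.foreach { r => r.offset = logEndOffset; logEndOffset += 1 }
    } else {
      require(records.zip(records.drop(1)).forall { case (a, b) => b.offset > a.offset },
        "existing offsets must be monotonically increasing")
      logEndOffset = records.last.offset + 1
    }

    // 3. With log append time, the broker stamps the records with the append time.
    if (useLogAppendTime) records.foreach(r => r.timestamp = now)

    // 4. Roll to a new segment when the active one is full by size or by age,
    //    then append to the (possibly new) active segment.
    val incomingBytes = records.map(_.payload.length).sum
    if (active.shouldRoll(now, incomingBytes))
      active = new SketchSegment(records.head.offset, active.maxBytes, active.maxAgeMs, now)
    records.foreach(active.append)
  }
}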