An In-Depth Look at the Storage Structure of Kafka's Offset Index Files

Article link: https://blog.youkuaiyun.com/daweias/article/details/144240091

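Let's take an in-depth look at the storage structure of Kafka's offset index files. I'll start with the basic concepts and work down to the implementation details.

Basic Structure of a Kafka Offset Index File

First, let's look at what an index file is made of. Each index file is a binary file consisting of a sequence of index entries. Every entry occupies a fixed 8 bytes and has two fields: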

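class IndexEntry {
    final int relativeOffset;    // offset relative to the segment's base offset, 4 bytes
    final int position;          // physical byte position in the log file, 4 bytes

    IndexEntry(int relativeOffset, int position) { this.relativeOffset = relativeOffset; this.position = position; }
    int getRelativeOffset() { return relativeOffset; }
    int getPosition() { return position; }
}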
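
To make the 8-byte layout concrete, here is a minimal sketch that encodes one entry and prints its bytes in hex. The class name `IndexEntryLayout`, the helper names, and the sample values are illustrative assumptions, and the sketch assumes big-endian byte order, which is the `ByteBuffer` default.

```java
import java.nio.ByteBuffer;

public class IndexEntryLayout {
    static final int ENTRY_SIZE = 8; // 4-byte relative offset + 4-byte position

    // Encodes one entry into its 8-byte form (big-endian, the ByteBuffer default).
    static byte[] encode(int relativeOffset, int position) {
        return ByteBuffer.allocate(ENTRY_SIZE)
                .putInt(relativeOffset)
                .putInt(position)
                .array();
    }

    // Renders bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical entry: relative offset 100 stored at byte position 4096.
        System.out.println(toHex(encode(100, 4096)));
        // prints "00 00 00 64 00 00 10 00"
    }
}
```

The first four bytes hold the relative offset (0x64 = 100) and the last four hold the position (0x1000 = 4096), so entry number n always starts at byte n * 8 of the file.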

On-Disk Storage Format

Let's see how the index file is actually laid out on disk:

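import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class IndexFileStructure {
    private static final int ENTRY_SIZE = 8;  // 4-byte relative offset + 4-byte position

    // Appends one index entry at the channel's current position
    public void writeIndexEntry(FileChannel channel, int relativeOffset, int position) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(ENTRY_SIZE);
        buffer.putInt(relativeOffset);  // relative offset (4 bytes)
        buffer.putInt(position);        // physical position (4 bytes)
        buffer.flip();
        // A single write() call may not drain the buffer, so loop until it is empty
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
    }

    // Reads the index entry that starts at byte entryPosition of the file
    public IndexEntry readIndexEntry(FileChannel channel, long entryPosition) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(ENTRY_SIZE);
        // A positional read may return fewer bytes than requested, so loop until the entry is complete
        while (buffer.hasRemaining()) {
            if (channel.read(buffer, entryPosition + buffer.position()) < 0) {
                throw new IOException("Unexpected end of index file at " + entryPosition);
            }
        }
        buffer.flip();

        return new IndexEntry(
            buffer.getInt(),  // relative offset
            buffer.getInt()   // physical position
        );
    }
}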

Relative Offset Design

Kafka stores relative offsets rather than absolute offsets to save space. Here is how that works:

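public class OffsetCalculation {
    private final long baseOffset;  // base offset of the segment file

    public OffsetCalculation(long baseOffset) {
        this.baseOffset = baseOffset;
    }

    public int calculateRelativeOffset(long absoluteOffset) {
        // Convert an absolute offset to a relative one, failing fast instead of
        // silently truncating when the difference does not fit in an int
        long relativeOffset = absoluteOffset - baseOffset;
        if (relativeOffset < 0 || relativeOffset > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "Offset " + absoluteOffset + " is out of range for base offset " + baseOffset);
        }
        return (int) relativeOffset;
    }

    public long calculateAbsoluteOffset(int relativeOffset) {
        // Convert a relative offset back to an absolute one
        return baseOffset + relativeOffset;
    }
}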

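This design has two main advantages:

A relative offset takes 4 bytes, whereas an absolute offset would take 8, so each index entry shrinks from 12 bytes to 8
The offset difference within a single segment file normally stays within the range of an int, and Kafka rolls to a new segment before a relative offset would overflow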
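
As a worked example of the conversion, the sketch below uses a hypothetical segment whose base offset is larger than `Integer.MAX_VALUE`, so the absolute offset could not be stored in 4 bytes but the relative one can. The class name `RelativeOffsetDemo` and the sample offsets are assumptions made for illustration.

```java
public class RelativeOffsetDemo {
    // Converts an absolute offset to a 4-byte relative offset, rejecting values that do not fit.
    static int toRelative(long baseOffset, long absoluteOffset) {
        long delta = absoluteOffset - baseOffset;
        if (delta < 0 || delta > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "offset " + absoluteOffset + " does not fit a segment with base " + baseOffset);
        }
        return (int) delta;
    }

    // Converts a relative offset back to the absolute offset.
    static long toAbsolute(long baseOffset, int relativeOffset) {
        return baseOffset + relativeOffset;
    }

    public static void main(String[] args) {
        long baseOffset = 5_000_000_000L;   // hypothetical base offset, above Integer.MAX_VALUE
        long absolute = 5_000_000_123L;

        int relative = toRelative(baseOffset, absolute);
        System.out.println(relative);                         // prints 123
        System.out.println(toAbsolute(baseOffset, relative)); // prints 5000000123
    }
}
```

The range check is the important part: a plain `(int)` cast would silently wrap around for a difference larger than `Integer.MAX_VALUE` and produce a wrong index entry.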

Writing Index Entries

When messages are appended to the log file, index entries are written as follows:

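import java.nio.ByteBuffer;

public class IndexWriter {
    private final long baseOffset;        // base offset of the segment
    private final int indexInterval;      // index interval in bytes, 4 KB by default
    private final ByteBuffer index;       // destination for index entries (for example, a mapped buffer)
    private long currentPosition = 0;     // log position of the most recently indexed message
    private int lastRelativeOffset = -1;  // relative offset of the last entry written, -1 if none

    public IndexWriter(long baseOffset, int indexInterval, ByteBuffer index) {
        this.baseOffset = baseOffset;
        this.indexInterval = indexInterval;
        this.index = index;
    }

    public void maybeAddIndex(long offset, long position) {
        // Write an entry only when at least indexInterval bytes separate
        // this message from the last indexed one
        if (position - currentPosition >= indexInterval) {
            writeIndex(offset, position);
            currentPosition = position;
        }
    }

    private void writeIndex(long offset, long position) {
        // Compute the relative offset (throws ArithmeticException if it does not fit in an int)
        int relativeOffset = Math.toIntExact(offset - baseOffset);

        // Entries must be written in strictly increasing offset order
        if (relativeOffset <= lastRelativeOffset) {
            throw new IllegalArgumentException("Offsets must be monotonically increasing");
        }

        // Write the entry and remember its offset for the next ordering check
        index.putInt(relativeOffset);
        index.putInt(Math.toIntExact(position));
        lastRelativeOffset = relativeOffset;
    }
}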
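
To show how sparse the resulting index is, the following sketch replays the interval rule described above over ten hypothetical 1500-byte records with a 4096-byte interval. It mirrors the rule as stated in this article rather than Kafka's source code, and the class name `SparseIndexDemo` and the record sizes are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SparseIndexDemo {
    // Returns {offset, position} pairs for the records that would receive an index entry.
    static List<long[]> indexedRecords(long baseOffset, int[] recordSizes, int indexInterval) {
        List<long[]> indexed = new ArrayList<>();
        long position = 0;            // byte position where the next record starts
        long lastIndexedPosition = 0; // position of the most recently indexed record
        for (int i = 0; i < recordSizes.length; i++) {
            if (position - lastIndexedPosition >= indexInterval) {
                indexed.add(new long[] {baseOffset + i, position});
                lastIndexedPosition = position;
            }
            position += recordSizes[i];
        }
        return indexed;
    }

    public static void main(String[] args) {
        int[] sizes = new int[10];
        Arrays.fill(sizes, 1500); // ten hypothetical 1500-byte records
        for (long[] entry : indexedRecords(1000L, sizes, 4096)) {
            System.out.println("offset=" + entry[0] + " position=" + entry[1]);
        }
        // prints:
        // offset=1003 position=4500
        // offset=1006 position=9000
        // offset=1009 position=13500
    }
}
```

Only three of the ten records get an entry, so the index stays small, and a lookup for any other offset has to finish with a short sequential scan of the log.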

Binary Search Implementation

Because index entries are sorted by offset, binary search can locate an entry quickly:

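import java.nio.ByteBuffer;

public class BinarySearch {
    private static final int ENTRY_SIZE = 8;  // each index entry is 8 bytes
    private final ByteBuffer index;           // index contents; limit() marks the end of the written entries
    private final long baseOffset;            // base offset of the segment

    public BinarySearch(ByteBuffer index, long baseOffset) {
        this.index = index;
        this.baseOffset = baseOffset;
    }

    // Returns the entry with the largest offset <= targetOffset, or (0, 0) if there is none
    public IndexEntry findOffset(long targetOffset) {
        // Convert the target to a relative offset (assumes it lies within this segment)
        int relativeTarget = (int) (targetOffset - baseOffset);

        int start = 0;
        int end = index.limit() / ENTRY_SIZE - 1;

        while (start <= end) {
            int mid = (start + end) >>> 1;
            IndexEntry entry = readIndexEntry(mid);

            if (entry.getRelativeOffset() < relativeTarget) {
                start = mid + 1;
            } else if (entry.getRelativeOffset() > relativeTarget) {
                end = mid - 1;
            } else {
                return entry;
            }
        }

        // No exact match: end points at the closest entry below the target.
        // end is -1 when the index is empty or every entry is above the target,
        // in which case the scan has to start at the beginning of the segment.
        return end < 0 ? new IndexEntry(0, 0) : readIndexEntry(end);
    }

    private IndexEntry readIndexEntry(int slot) {
        int base = slot * ENTRY_SIZE;  // byte position of the entry
        return new IndexEntry(index.getInt(base), index.getInt(base + 4));
    }
}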
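
The lookup can be traced on a tiny index. The sketch below is a self-contained floor search over three hypothetical entries, returning the log position where a sequential scan should start. The names `IndexLookupDemo`, `floorSlot`, and `scanStart`, and the choice to fall back to position 0 when no entry qualifies, are assumptions for this example.

```java
import java.nio.ByteBuffer;

public class IndexLookupDemo {
    static final int ENTRY_SIZE = 8;

    // Returns the slot of the last entry whose relative offset is <= target, or -1 if none.
    static int floorSlot(ByteBuffer index, int entries, int relativeTarget) {
        int lo = 0, hi = entries - 1, result = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int found = index.getInt(mid * ENTRY_SIZE); // absolute read of the offset field
            if (found <= relativeTarget) {
                result = mid;   // candidate; keep looking to the right for a closer one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }

    // Returns the log position from which a sequential scan for targetOffset should start.
    static int scanStart(ByteBuffer index, int entries, long baseOffset, long targetOffset) {
        int slot = floorSlot(index, entries, (int) (targetOffset - baseOffset));
        return slot < 0 ? 0 : index.getInt(slot * ENTRY_SIZE + 4);
    }

    public static void main(String[] args) {
        // Hypothetical index of a segment with base offset 1000: (3, 4500), (6, 9000), (9, 13500).
        ByteBuffer index = ByteBuffer.allocate(3 * ENTRY_SIZE);
        index.putInt(3).putInt(4500).putInt(6).putInt(9000).putInt(9).putInt(13500);

        System.out.println(scanStart(index, 3, 1000L, 1007L)); // prints 9000
        System.out.println(scanStart(index, 3, 1000L, 1001L)); // prints 0
        System.out.println(scanStart(index, 3, 1000L, 1009L)); // prints 13500
    }
}
```

Offset 1007 has no entry of its own, so the search lands on the entry for 1006 and the reader scans forward from byte 9000. Offset 1001 is below every entry, so the scan starts at the beginning of the segment.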

Memory-Mapping Optimization

To speed up index reads and writes, Kafka uses memory-mapped files:

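import java.io.File;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;

public class MappedIndexFile {
    private final MappedByteBuffer mmap;

    public MappedIndexFile(File file, int size) throws IOException {
        // The mapping stays valid after the channel is closed, so release the channel right away
        try (FileChannel channel = FileChannel.open(file.toPath(),
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map the file into memory
            this.mmap = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public void writeEntry(IndexEntry entry) {
        // Append directly to the mapped region at the buffer's current position
        mmap.putInt(entry.getRelativeOffset());
        mmap.putInt(entry.getPosition());
    }

    public IndexEntry readEntry(int position) {
        // Absolute reads leave the buffer's position untouched,
        // so a read cannot disturb where the next entry is appended
        return new IndexEntry(
            mmap.getInt(position),      // relative offset
            mmap.getInt(position + 4)   // physical position
        );
    }
}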
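
A short round trip illustrates the mapped access pattern: entries are appended through the mapping with relative puts and read back with absolute gets. This is a sketch against a temporary file, not Kafka's implementation, and the class name `MmapIndexDemo`, the method `writeThenRead`, and the sample entries are assumptions.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapIndexDemo {
    static final int ENTRY_SIZE = 8;

    // Writes {relativeOffset, position} pairs through a mapping, then reads one slot back.
    static int[] writeThenRead(Path file, int[][] entries, int slotToRead) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping past the current end of the file grows the file to the mapped size.
            MappedByteBuffer mmap = channel.map(
                FileChannel.MapMode.READ_WRITE, 0, (long) entries.length * ENTRY_SIZE);
            for (int[] entry : entries) {
                mmap.putInt(entry[0]).putInt(entry[1]); // relative puts append in order
            }
            mmap.force(); // flush the mapped pages to the file
            int base = slotToRead * ENTRY_SIZE;
            // Absolute gets do not move the buffer's position.
            return new int[] {mmap.getInt(base), mmap.getInt(base + 4)};
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".index");
        file.toFile().deleteOnExit();

        int[] entry = writeThenRead(file, new int[][] {{3, 4500}, {6, 9000}, {9, 13500}}, 1);
        System.out.println(entry[0] + " -> " + entry[1]); // prints "6 -> 9000"
        System.out.println(Files.size(file));              // prints 24
    }
}
```

Because slot n always lives at byte n * 8, a read needs no system call once the file is mapped, which is what makes binary search over the index cheap.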

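Index File Maintenance

An index file has to be kept consistent with its log segment, for example by truncating entries when the log itself is truncated: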

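import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class IndexMaintenance {
    private static final int ENTRY_SIZE = 8;
    private final FileChannel indexFile;
    private final long baseOffset;
    private final int maxIndexSize;  // maximum index file size in bytes

    public IndexMaintenance(FileChannel indexFile, long baseOffset, int maxIndexSize) {
        this.indexFile = indexFile;
        this.baseOffset = baseOffset;
        this.maxIndexSize = maxIndexSize;
    }

    public void truncate(long offset) throws IOException {
        // Drop every index entry whose offset is greater than or equal to the given offset
        int relativeOffset = Math.toIntExact(offset - baseOffset);
        ByteBuffer key = ByteBuffer.allocate(4);
        int start = 0;
        int end = (int) (indexFile.size() / ENTRY_SIZE);
        while (start < end) {  // binary search for the first slot to drop
            int mid = (start + end) >>> 1;
            key.clear();
            indexFile.read(key, (long) mid * ENTRY_SIZE);
            key.flip();
            if (key.getInt() < relativeOffset) start = mid + 1; else end = mid;
        }
        indexFile.truncate((long) start * ENTRY_SIZE);
    }

    public boolean isFull() throws IOException {
        // An index is not compacted entry by entry: once it reaches maxIndexSize the segment
        // is rolled, and the index file is removed together with its log segment
        return indexFile.size() >= maxIndexSize;
    }
}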
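
The effect of truncation on the file size can be worked through on a small in-memory index. The sketch below assumes that truncating to an offset drops every entry at or above it, and it computes how many bytes of the index survive. The class name `IndexTruncateDemo` and the sample entries are hypothetical.

```java
import java.nio.ByteBuffer;

public class IndexTruncateDemo {
    static final int ENTRY_SIZE = 8;

    // Returns how many entries survive after dropping all with relative offset >= relativeTarget.
    static int entriesToKeep(ByteBuffer index, int entries, int relativeTarget) {
        int lo = 0, hi = entries; // search for the first slot whose offset is >= target
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (index.getInt(mid * ENTRY_SIZE) < relativeTarget) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Hypothetical index with relative offsets 3, 6 and 9.
        ByteBuffer index = ByteBuffer.allocate(3 * ENTRY_SIZE);
        index.putInt(3).putInt(4500).putInt(6).putInt(9000).putInt(9).putInt(13500);

        // Truncating to relative offset 6 keeps only the entry for 3.
        System.out.println(entriesToKeep(index, 3, 6) * ENTRY_SIZE); // prints 8
        // Truncating to relative offset 7 keeps the entries for 3 and 6.
        System.out.println(entriesToKeep(index, 3, 7) * ENTRY_SIZE); // prints 16
    }
}
```

The returned byte count is the length the index file would be truncated to, since every surviving entry occupies exactly 8 bytes.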