RocketMQ MappedFile: Storage Principles and File Pre-warming

First published 2022-09-01 17:24:12; last modified 2024-08-17 23:25:09

License: CC 4.0 BY-SA

Tags: #rocketmq #java
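This article walks through the storage mechanism of RocketMQ's MappedFile class, covering file-mapping management, memory-page pre-warming, and the file creation strategy, to help readers understand RocketMQ's efficient storage design.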

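RocketMQ's underlying storage consists of the CommitLog, ConsumeQueue, IndexFile, and checkpoint files. It operates on these disk files through the JDK's NIO package. MappedFile is the wrapper class around the NIO API for disk files; RocketMQ uses it to write, read, and flush them. Abstracting disk files behind one uniform utility class is a sound design that speeds up development. Accordingly, whenever a CommitLog, ConsumeQueue, or IndexFile file is created, a corresponding MappedFile object is created in memory to manage that file's memory mapping.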

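Understanding how MappedFile manages disk files is the foundation for reading RocketMQ's efficient storage design for CommitLog and ConsumeQueue files. Because MappedFile reads and writes through the NIO API, some familiarity with NIO ByteBuffer operations is needed.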
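As a refresher on the NIO calls involved, here is a minimal sketch of mapping a file and writing through the mapping. The class name `MmapSketch`, the method `roundTrip`, and the temp-file name are made up for illustration; only `RandomAccessFile`, `FileChannel.map`, `MappedByteBuffer.force`, and `ByteBuffer.slice` are real JDK calls. It is not RocketMQ code.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: class and method names are hypothetical, not RocketMQ APIs.
class MmapSketch {

    /** Maps a temp file, writes the payload at offset 0, forces it to disk, and reads it back. */
    static String roundTrip(String payload, int fileSize) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        try {
            Path path = Files.createTempFile("mmap-sketch", ".bin");
            path.toFile().deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
                 FileChannel channel = raf.getChannel()) {
                // Size the file explicitly before mapping it.
                raf.setLength(fileSize);
                MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);

                // slice() shares the content but has its own position,
                // so the mapped buffer's own position stays at 0.
                ByteBuffer writeView = mapped.slice();
                writeView.put(bytes);

                // Ask the OS to write the dirty pages of the mapping to the storage device.
                mapped.force();

                ByteBuffer readView = mapped.slice();
                byte[] out = new byte[bytes.length];
                readView.get(out);
                return new String(out, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", 4096));
    }
}
```

Writing through a slice rather than the mapped buffer itself keeps the shared buffer's position untouched, which matters when several readers and one writer share one mapping.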

Class variables and instance variables of MappedFile

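RocketMQ needs to know how much memory is mapped and how many mapped files currently exist so that it can manage them. MappedFile therefore defines the following class variables to record this data: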

// OS page size, 4 KB; this is the usual size on Unix-like systems
public static final int OS_PAGE_SIZE = 1024 * 4;
// Total virtual memory, in bytes, mapped by all MappedFile instances
private static final AtomicLong TOTAL_MAPPED_VIRTUAL_MEMORY = new AtomicLong(0);
// Total number of memory-mapped files created
private static final AtomicInteger TOTAL_MAPPED_FILES = new AtomicInteger(0);

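Next come the instance variables of the mapped file that each MappedFile represents. MappedFile supports both synchronous and asynchronous flushing. To support asynchronous flushing it keeps three position pointers, which make it possible to tell apart uncommitted data, committed but unflushed data, and flushed data. Two background threads then commit and flush the data on a timer.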

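// Current write position; the next append starts here
protected final AtomicInteger wrotePosition = new AtomicInteger(0);
// Current commit position; data before it has been committed to fileChannel,
// data between committedPosition and wrotePosition has not been committed yet
protected final AtomicInteger committedPosition = new AtomicInteger(0);
// Current flush position; data before it is on disk,
// data between flushedPosition and committedPosition has not reached disk yet
private final AtomicInteger flushedPosition = new AtomicInteger(0);
// File size in bytes
protected int fileSize;
// File channel of the disk file
protected FileChannel fileChannel;
// With asynchronous flushing, data is first written to writeBuffer; the CommitRealTime thread
// commits it to fileChannel every 200 ms, and the FlushRealTime thread then flushes
// fileChannel to disk every 500 ms
protected ByteBuffer writeBuffer = null;
// Off-heap memory pool that serves asynchronous flushing. To avoid the cost of allocating and
// freeing memory, a block of off-heap memory is requested from the OS and locked up front;
// writeBuffer is borrowed from this pool
protected TransientStorePool transientStorePool = null;
// File name, normally the byte offset at which this file starts
private String fileName;
// Starting byte offset of this file, initialized as Long.parseLong(file.getName())
private long fileFromOffset;
// Disk file object
private File file;
// Memory-mapped view of the disk file; with synchronous flushing, data is written directly to mappedByteBuffer
private MappedByteBuffer mappedByteBuffer;
// Timestamp of the most recent append to this file
private volatile long storeTimestamp = 0;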
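To make the roles of the three pointers concrete, here is a minimal sketch of the bookkeeping for the asynchronous path with a write buffer. The class `PositionSketch` and its methods are hypothetical; it moves no real data, assumes a single writer thread, and only models the invariant flushedPosition <= committedPosition <= wrotePosition described above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: models the three pointers, not RocketMQ's actual commit/flush code.
class PositionSketch {
    private final int fileSize;
    private final AtomicInteger wrotePosition = new AtomicInteger(0);
    private final AtomicInteger committedPosition = new AtomicInteger(0);
    private final AtomicInteger flushedPosition = new AtomicInteger(0);

    PositionSketch(int fileSize) {
        this.fileSize = fileSize;
    }

    /** Reserves space for an append; returns false when the file cannot hold it. Single writer assumed. */
    boolean append(int bytes) {
        if (wrotePosition.get() + bytes > fileSize) {
            return false;
        }
        wrotePosition.addAndGet(bytes);
        return true;
    }

    /** Models the commit thread: everything written so far becomes committed. */
    int commit() {
        committedPosition.set(wrotePosition.get());
        return committedPosition.get();
    }

    /** Models the flush thread: everything committed so far becomes flushed. */
    int flush() {
        flushedPosition.set(committedPosition.get());
        return flushedPosition.get();
    }

    /** Bytes written but not yet committed. */
    int uncommittedBytes() {
        return wrotePosition.get() - committedPosition.get();
    }

    /** Bytes committed but not yet flushed to disk. */
    int unflushedBytes() {
        return committedPosition.get() - flushedPosition.get();
    }

    public static void main(String[] args) {
        PositionSketch p = new PositionSketch(100);
        p.append(40);
        p.commit();
        p.append(30);
        System.out.println("uncommitted=" + p.uncommittedBytes() + ", unflushed=" + p.unflushedBytes());
    }
}
```

After an append of 40 bytes, a commit, and a second append of 30 bytes, 30 bytes are uncommitted and 40 bytes are committed but unflushed, which is exactly the three-way split the pointers exist to track.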

Initialization methods of MappedFile

public void init(final String fileName, final int fileSize,
    final TransientStorePool transientStorePool) throws IOException {
    init(fileName, fileSize);
    // Choosing the overload that takes a memory pool means asynchronous flushing is intended
    this.writeBuffer = transientStorePool.borrowBuffer();
    this.transientStorePool = transientStorePool;
}
// Default initialization method
private void init(final String fileName, final int fileSize) throws IOException {
    this.fileName = fileName;
    this.fileSize = fileSize;
    this.file = new File(fileName);
    // fileFromOffset is the starting byte offset of this file
    this.fileFromOffset = Long.parseLong(this.file.getName());
    boolean ok = false;
    // Make sure the directory containing the file exists
    ensureDirOK(this.file.getParent());
    try {
        // Open a file channel through the NIO API to operate on the file
        this.fileChannel = new RandomAccessFile(this.file, "rw").getChannel();
        // mmap: maps the file into the process address space, so reads and writes avoid an extra copy between kernel and user space
        this.mappedByteBuffer = this.fileChannel.map(MapMode.READ_WRITE, 0, fileSize);
        // Global record of how much memory is used by file mappings
        TOTAL_MAPPED_VIRTUAL_MEMORY.addAndGet(fileSize);
        // Global record of how many mapped files have been created
        TOTAL_MAPPED_FILES.incrementAndGet();
        ok = true;
    } catch (FileNotFoundException e) {
        log.error("Failed to create file " + this.fileName, e);
        throw e;
    } catch (IOException e) {
        log.error("Failed to map file " + this.fileName, e);
        throw e;
    } finally {
        if (!ok && this.fileChannel != null) {
            this.fileChannel.close();
        }
    }
}
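Since `init` derives `fileFromOffset` by parsing the file name, the naming scheme can be sketched as follows. The class `FileNameSketch` and its helpers are hypothetical, and the 20-digit zero-padded name format is an assumption about how RocketMQ names its store files, not something shown in the code above.

```java
import java.io.File;
import java.util.Locale;

// Illustrative only: helper names are made up; the 20-digit padding is an assumption.
class FileNameSketch {

    /** Formats a starting byte offset as a zero-padded, 20-digit file name. */
    static String offsetToFileName(long offset) {
        return String.format(Locale.ROOT, "%020d", offset);
    }

    /** Mirrors the parsing step in init: the last path segment is the starting offset. */
    static long fileFromOffset(String path) {
        return Long.parseLong(new File(path).getName());
    }

    /** Name of the file that follows the given one when every file has the same size. */
    static String nextFileName(String path, int fileSize) {
        return offsetToFileName(fileFromOffset(path) + fileSize);
    }

    public static void main(String[] args) {
        int oneGiB = 1024 * 1024 * 1024;
        String first = offsetToFileName(0L);
        System.out.println(first);
        System.out.println(nextFileName("store/commitlog/" + first, oneGiB));
    }
}
```

With fixed-size files, naming each file after its starting offset lets a reader locate the file holding any global offset with simple arithmetic, with no separate index of files.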

How MappedFile writes data

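MappedFile appends data to a file in its appendMessage methods, which accept either a single message or a batch. These methods are the entry point for writes to every file type, CommitLog and ConsumeQueue alike. Since each file type lays out its data on disk differently, how is per-file serialization customized? Through AppendMessageCallback: each writer passes in its own implementation, and the shared append logic calls back into it.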
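The callback idea can be sketched in isolation as follows. Everything here is hypothetical: `AppendCallback` is a simplified stand-in for AppendMessageCallback with an invented signature, and the two record layouts are made up. The point is only that the shared append logic owns the position bookkeeping while the callback owns the byte layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: a simplified stand-in for the callback pattern, not RocketMQ's interfaces.
class AppendCallbackSketch {

    /** Serializes a payload into target and returns the number of bytes written (0 if it does not fit). */
    interface AppendCallback {
        int doAppend(ByteBuffer target, int maxBlank, String payload);
    }

    /** Layout 1: 4-byte length prefix followed by the UTF-8 body. */
    static final AppendCallback LENGTH_PREFIXED = (target, maxBlank, payload) -> {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        int total = 4 + body.length;
        if (total > maxBlank) {
            return 0;
        }
        target.putInt(body.length);
        target.put(body);
        return total;
    };

    /** Layout 2: a fixed 8-byte entry holding the payload parsed as a long. */
    static final AppendCallback FIXED_LONG = (target, maxBlank, payload) -> {
        if (8 > maxBlank) {
            return 0;
        }
        target.putLong(Long.parseLong(payload));
        return 8;
    };

    private final ByteBuffer buffer;
    private int wrotePosition = 0;

    AppendCallbackSketch(int fileSize) {
        this.buffer = ByteBuffer.allocate(fileSize);
    }

    /** Shared append logic: positions a view at the write pointer, then delegates the layout to the callback. */
    int append(String payload, AppendCallback cb) {
        ByteBuffer view = buffer.duplicate();
        view.position(wrotePosition);
        int written = cb.doAppend(view, buffer.capacity() - wrotePosition, payload);
        wrotePosition += written;
        return written;
    }

    int getWrotePosition() {
        return wrotePosition;
    }

    public static void main(String[] args) {
        AppendCallbackSketch file = new AppendCallbackSketch(16);
        System.out.println(file.append("abc", LENGTH_PREFIXED));
        System.out.println(file.append("42", FIXED_LONG));
        System.out.println(file.getWrotePosition());
    }
}
```

In this sketch the same `append` method serves both layouts, so adding a new file format means writing a new callback rather than touching the position handling.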

// Single message
public AppendMessageResult appendMessage(final MessageExtBrokerInner msg, final AppendMessageCallback cb) {
    return appendMessagesInner(msg, cb);
}
// Batch of messages (MessageExtBatch)
public A