ZooKeeper Source Code Analysis: Restoring from a Snapshot

Mr.Gorgeous

Category: zookeeper. Tags: big data, distributed systems, zookeeper

Article link: https://blog.youkuaiyun.com/weixin_42442768/article/details/110134663

ZooKeeper: Restoring from a Snapshot

Preface
Source Code Analysis
Command for Viewing a Snapshot in Readable Form
Summary

Preface

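This article builds on the analysis of the ZooKeeper cluster startup process (https://blog.youkuaiyun.com/weixin_42442768/article/details/109247622). It walks through the source code that reads files from disk and restores them into ZooKeeper's in-memory data structures. The focus here is the deserialization of the snapshot; recovery from the transaction log is covered in the next article.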

Source Code Analysis

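The previous article analyzed the loadDataBase() method of the QuorumPeer class. This article analyzes the zkDb.loadDataBase() call inside it.

First, look at the zkDb member variable of the QuorumPeer class: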

    /**
     * ZKDatabase is a top level member of quorumpeer
     * which will be used in all the zookeeperservers
     * instantiated later. Also, it is created once on
     * bootup and only thrown away in case of a truncate
     * message from the leader
     */
    private ZKDatabase zkDb;

This variable is assigned when QuorumPeer is initialized. Within ZKDatabase, this article focuses on the following fields:

    protected DataTree dataTree;
    protected ConcurrentHashMap<Long, Integer> sessionsWithTimeouts;
    protected FileTxnSnapLog snapLog;

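DataTree is the data structure in which ZooKeeper stores its data, sessionsWithTimeouts stores session information, and FileTxnSnapLog is a helper class for restoring from snapshot and transaction log files. The details are covered in the data structures part.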
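To make the roles of these fields concrete, here is a minimal sketch of the in-memory state that a restore has to rebuild. It is not ZooKeeper's code: the class name ZkDatabaseStateSketch, the map used as a stand-in for DataTree, and the helper methods are all hypothetical. Only the shape of sessionsWithTimeouts (a ConcurrentHashMap<Long, Integer>) comes from the field declaration above, and reading its value as a timeout in milliseconds is an assumption.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the state held by ZKDatabase.
class ZkDatabaseStateSketch {
    // Toy stand-in for DataTree: znode path -> node data.
    final ConcurrentHashMap<String, byte[]> dataTree = new ConcurrentHashMap<>();

    // Same shape as the real field: session id -> timeout
    // (assumed here to be in milliseconds).
    final ConcurrentHashMap<Long, Integer> sessionsWithTimeouts = new ConcurrentHashMap<>();

    void createNode(String path, byte[] data) {
        dataTree.put(path, data);
    }

    void openSession(long sessionId, int timeoutMs) {
        sessionsWithTimeouts.put(sessionId, timeoutMs);
    }

    // Returns -1 when the session is unknown.
    int timeoutOf(long sessionId) {
        return sessionsWithTimeouts.getOrDefault(sessionId, -1);
    }

    public static void main(String[] args) {
        ZkDatabaseStateSketch db = new ZkDatabaseStateSketch();
        db.createNode("/app/config", new byte[] {1, 2, 3});
        db.openSession(0x100000001L, 30_000);
        System.out.println(db.dataTree.size() + " node(s), timeout="
                + db.timeoutOf(0x100000001L));
    }
}
```

A snapshot has to capture both maps, which is why the restore methods below take both a DataTree and a session map as arguments.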

Restoring files on disk into memory as a ZKDatabase

Now to the main topic, the loadDataBase method of the ZKDatabase class:

    public long loadDataBase() throws IOException {
        long zxid = snapLog.restore(dataTree, sessionsWithTimeouts, commitProposalPlaybackListener);
        initialized = true;
        return zxid;
    }

The actual restoring is delegated to snapLog, an instance of the helper class FileTxnSnapLog, which returns the latest zxid. Next, the restore method:

    /**
     * this function restores the server
     * database after reading from the
     * snapshots and transaction logs
     * @param dt the datatree to be restored
     * @param sessions the sessions to be restored
     * @param listener the playback listener to run on the
     * database restoration
     * @return the highest zxid restored
     * @throws IOException
     */
    public long restore(DataTree dt, Map<Long, Integer> sessions,
            PlayBackListener listener) throws IOException {
        long deserializeResult = snapLog.deserialize(dt, sessions);
        FileTxnLog txnLog = new FileTxnLog(dataDir);
        if (-1L == deserializeResult) {
            /* this means that we couldn't find any snapshot, so we need to
             * initialize an empty database (reported in ZOOKEEPER-2325) */
            if (txnLog.getLastLoggedZxid() != -1) {
                throw new IOException(
                        "No snapshot found, but there are log entries. " +
                        "Something is broken!");
            }
            /* TODO: (br33d) we should either put a ConcurrentHashMap on restore()
             *       or use Map on save() */
            save(dt, (ConcurrentHashMap<Long, Integer>)sessions);
            /* return a zxid of zero, since we the database is empty */
            return 0;
        }
        return fastForwardFromEdits(dt, sessions, listener);
    }

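This method mainly does three things:

1. Deserializes the snapshot file and restores it into the ZKDatabase.
2. If no snapshot file is found and the transaction log is also empty, writes the DataTree and session information of the ZKDatabase to disk as a new snapshot and returns a zxid of 0. If no snapshot is found but the log does contain entries, it throws an IOException.
3. Otherwise, calls fastForwardFromEdits to catch up from the transaction log and returns the latest zxid (this part will be analyzed in a separate article).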
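The branching in restore can be sketched in isolation as follows. This is a simplified model, not the ZooKeeper implementation: DiskState and its fields are hypothetical stand-ins for what snapLog.deserialize and txnLog.getLastLoggedZxid report, the snapshotWritten flag stands in for the save(...) call, and taking the larger of the two zxids is only an assumption about what fastForwardFromEdits ends up returning.

```java
import java.io.IOException;

// Hypothetical model of the decision logic in FileTxnSnapLog.restore().
class RestoreFlowSketch {

    /** Stand-in for what is found on disk. */
    static final class DiskState {
        final long snapshotZxid;    // -1 when no snapshot exists
        final long lastLoggedZxid;  // -1 when the transaction log is empty
        boolean snapshotWritten = false;

        DiskState(long snapshotZxid, long lastLoggedZxid) {
            this.snapshotZxid = snapshotZxid;
            this.lastLoggedZxid = lastLoggedZxid;
        }
    }

    static long restore(DiskState disk) throws IOException {
        long deserializeResult = disk.snapshotZxid;
        if (deserializeResult == -1L) {
            // No snapshot, but log entries exist: inconsistent state.
            if (disk.lastLoggedZxid != -1L) {
                throw new IOException("No snapshot found, but there are log entries.");
            }
            // Brand-new server: persist an empty snapshot (stands in for save).
            disk.snapshotWritten = true;
            return 0L;
        }
        // Stand-in for fastForwardFromEdits: assume the log is replayed on
        // top of the snapshot, so the result is the higher of the two zxids.
        return Math.max(deserializeResult, disk.lastLoggedZxid);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(restore(new DiskState(-1L, -1L))); // empty database
        System.out.println(restore(new DiskState(5L, 9L)));   // snapshot plus log
    }
}
```

The point of the sketch is that the return value -1 from deserialize is a sentinel for "no snapshot at all", and that this case is only considered healthy when the transaction log is empty as well.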

Deserializing the DataTree from the most recent snapshot file

Follow the call into the deserialize method of the FileSnap class:

    /**
     * deserialize a data tree from the most recent snapshot
     * @return the zxid of the snapshot
     */
    public long deserialize(DataTree dt, Map<Long, Integer> sessions)
            throws IOException {
        // we run through 100 snapshots (not all of them)
        // if we cannot get it running within 100 snapshots
        // we should  give up
        List<File> snapList = findNValidSnapshots(100);
        if (snapList.size() == 0) {
            return -1L;
        }
        File snap = null;
        boolean foundValid = false;
        for (int i = 0, snapListSize = snapList.size(); i < snapListSize; i++) {
            snap = snapList.get(i);
            LOG.info("Reading snapshot " + snap);
            try (InputStream snapIS = new BufferedInputStream(new FileInputStream(snap));
                 CheckedInputStream crcIn = new CheckedInputStream(snapIS,

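The visible part of deserialize already shows the pattern: take a bounded list of candidate snapshots, try them one by one, and read each file through a CheckedInputStream. The following is a minimal sketch of that pattern under stated assumptions, not the code of FileSnap. The toy file layout (length, payload, trailing checksum), the use of Adler32, the file names, and the choice to throw when no candidate is valid are all assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.zip.Adler32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

// Hypothetical sketch: checksum-verified read with fallback to older files.
class SnapshotFallbackSketch {

    /** Toy format: int length, payload bytes, then the checksum of both. */
    static void write(Path file, byte[] payload) throws IOException {
        try (CheckedOutputStream crcOut = new CheckedOutputStream(
                     new BufferedOutputStream(Files.newOutputStream(file)), new Adler32());
             DataOutputStream out = new DataOutputStream(crcOut)) {
            out.writeInt(payload.length);
            out.write(payload);
            out.writeLong(crcOut.getChecksum().getValue());
        }
    }

    /** Reads one file and fails if the stored checksum does not match. */
    static byte[] readVerified(Path file) throws IOException {
        try (CheckedInputStream crcIn = new CheckedInputStream(
                     new BufferedInputStream(Files.newInputStream(file)), new Adler32());
             DataInputStream in = new DataInputStream(crcIn)) {
            int len = in.readInt();
            if (len < 0 || len > Files.size(file)) {
                throw new IOException("implausible length in " + file);
            }
            byte[] payload = new byte[len];
            in.readFully(payload);
            long computed = crcIn.getChecksum().getValue(); // covers length + payload
            long stored = in.readLong();
            if (computed != stored) {
                throw new IOException("checksum mismatch in " + file);
            }
            return payload;
        }
    }

    /** Tries candidates in the given order (newest first) and returns the first valid one. */
    static byte[] loadNewestValid(List<Path> newestFirst) throws IOException {
        for (Path snap : newestFirst) {
            try {
                return readVerified(snap);
            } catch (IOException e) {
                System.err.println("Skipping " + snap + ": " + e.getMessage());
            }
        }
        throw new IOException("no valid snapshot among " + newestFirst.size() + " candidates");
    }

    /** Writes two toy snapshots, corrupts the newer one, and restores. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("snap-sketch");
            Path older = dir.resolve("snapshot.1");
            Path newer = dir.resolve("snapshot.2");
            write(older, "older state".getBytes(StandardCharsets.UTF_8));
            write(newer, "newer state".getBytes(StandardCharsets.UTF_8));
            byte[] raw = Files.readAllBytes(newer);
            raw[4] ^= 0x7F; // flip bits in the first payload byte
            Files.write(newer, raw);
            return new String(loadNewestValid(Arrays.asList(newer, older)),
                    StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "older state"
    }
}
```

Trying several candidates instead of only the newest file means a snapshot that was damaged, for example by a crash while it was being written, does not prevent the server from starting as long as an older intact snapshot is available.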