How Nacos Implements the Raft Protocol: A Source-Level Walkthrough of Leader Election and Log Replication

Latest recommended article published on 2025-11-24 11:43:51

License: CC 4.0 BY-SA


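In Nacos, the Raft protocol is the core mechanism behind CP mode (strong consistency). It is used mainly by the Config Module to synchronize data and keep it consistent. Nacos does not use etcd's Raft library directly. Instead it ships a lightweight, in-house Raft implementation based on the ideas in the Raft paper, with nodes communicating over HTTP.

This article walks through the two core flows of Raft in Nacos, Leader election and log replication, from the source-code perspective. It follows the key classes and method call chains and uses simplified code snippets to explain how the implementation works.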

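I. Prerequisites: the overall architecture of Nacos Raft

Module location: com.alibaba.nacos.core.distributed.raft
Communication: custom RPC over HTTP (not gRPC)
Data storage:
- Log: {nacos.home}/data/protocol/raft/{group}/log.data
- Snapshot: {nacos.home}/data/protocol/raft/{group}/snapshot/
Core classes:
- RaftPeer: represents one Raft node (Follower/Leader/Candidate)
- RaftCore: the core scheduling engine of Raft
- LogProcessor: the state machine processor invoked after a log entry is committed (for example, to update configuration)
- RaftStore: persistent storage for logs and snapshots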

II. Leader election in the source

1. Where elections are triggered: `GlobalExecutor`

Nacos uses a scheduled task executor to drive the election timeout mechanism.

// com.alibaba.nacos.core.distributed.raft.RaftCore#start
public void start() {
    GlobalExecutor.submitLeaderTask(new MasterElectionCore());
    GlobalExecutor.submitHeartBeat(new HeartBeatCore());
}

MasterElectionCore is the core election task.

2. Election timeout logic: `MasterElectionCore`

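class MasterElectionCore implements Runnable {
    @Override
    public void run() {
        try {
            // Start an election only when this node is not the Leader
            // and no Leader is currently known (heartbeats have stopped arriving).
            if (!peers.isLeader(NetUtils.localServer()) && peers.getLeader() == null) {
                Loggers.RAFT.info("No leader is available, start leader election.");
                requestVote();
            }
        } catch (Exception e) {
            Loggers.RAFT.error("Exception while master election", e);
        }
    }
}

peers.isLeader(NetUtils.localServer()): checks whether the local node is currently the Leader.
If the local node is not the Leader and no Leader is known because heartbeats have timed out, requestVote() is called to start a vote.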

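⏱️ The election timeout defaults to a random value between 500 ms and 1000 ms. Randomizing it makes it unlikely that several nodes time out and start elections at the same moment, which reduces split votes.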
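The randomized timeout can be illustrated with a small standalone sketch. It is not Nacos code: `ElectionTimer` and all of its methods are hypothetical names, and the 500 ms to 1000 ms range is simply the value quoted above. Time is passed in explicitly so the logic can be checked without sleeping.

```java
import java.util.Random;

/** Hypothetical helper (not a Nacos class): a follower's randomized election timer. */
final class ElectionTimer {
    private final long minTimeoutMs;
    private final long maxTimeoutMs;
    private final Random random;
    private long lastHeartbeatMs;
    private long currentTimeoutMs;

    ElectionTimer(long minTimeoutMs, long maxTimeoutMs, Random random, long nowMs) {
        if (minTimeoutMs <= 0 || maxTimeoutMs < minTimeoutMs) {
            throw new IllegalArgumentException("invalid timeout range");
        }
        this.minTimeoutMs = minTimeoutMs;
        this.maxTimeoutMs = maxTimeoutMs;
        this.random = random;
        onHeartbeat(nowMs);
    }

    /** Restart the countdown with a fresh random timeout in [min, max]. */
    void onHeartbeat(long nowMs) {
        lastHeartbeatMs = nowMs;
        long span = maxTimeoutMs - minTimeoutMs + 1;
        currentTimeoutMs = minTimeoutMs + (long) (random.nextDouble() * span);
    }

    /** True once no heartbeat has been seen for at least the current timeout. */
    boolean isExpired(long nowMs) {
        return nowMs - lastHeartbeatMs >= currentTimeoutMs;
    }

    long currentTimeoutMs() {
        return currentTimeoutMs;
    }

    public static void main(String[] args) {
        ElectionTimer timer = new ElectionTimer(500, 1000, new Random(), 0L);
        System.out.println("timeout(ms) = " + timer.currentTimeoutMs());
        System.out.println("expired at t=499? " + timer.isExpired(499));
    }
}
```

Because each node draws its own timeout, one node usually times out first and collects votes before the others become candidates.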

3. Starting a vote: `RaftCore#requestVote`

public synchronized void requestVote() {
    // Increment the term
    long term = peers.term.incrementAndGet();

    // Become a Candidate and vote for ourselves
    RaftPeer local = peers.get(NetUtils.localServer());
    local.voteFor = local.ip;
    local.state = RaftPeer.State.CANDIDATE;
    local.term.set(term);

    Loggers.RAFT.info("leader timeout, start voting, me: {}", local.ip);

    // Broadcast the vote request to every other node
    List<CompletableFuture<Void>> votes = new ArrayList<>();
    for (final String server : peers.allServers()) {
        if (NetUtils.localServer().equals(server)) {
            continue;
        }
        final String url = "http://" + server + ":8848/nacos/v1/raft/vote";
        // Build the vote request parameters
        Map<String, String> params = new HashMap<>();
        params.put("term", String.valueOf(term));
        params.put("candidate", local.ip);
        params.put("voteGranter", local.ip);

        // Send the HTTP request asynchronously
        votes.add(HttpAsyncClient.INSTANCE.get(url, params).thenRun(() -> {
            // Vote response received: record the responder's term and voteFor in peers
        }));
    }

    // Tally once every request has completed.
    // Note: allOf waits for all futures, not just a majority, and thenRun is skipped if any request fails.
    CompletableFuture.allOf(votes.toArray(new CompletableFuture[0]))
        .thenRun(() -> {
            // Count the votes. The local peer already has voteFor == local.ip,
            // so start from 0 to avoid counting our own vote twice.
            int voteCount = 0;
            for (RaftPeer peer : peers.all()) {
                if (peer.term.get() == term && local.ip.equals(peer.voteFor)) {
                    voteCount++;
                }
            }
            // Become Leader on a strict majority
            if (voteCount > peers.all().size() / 2) {
                becomeLeader();
            }
        });
}

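Key points:

Term increment: every Candidate proposes a higher term.
Broadcast RequestVote: the request is sent over HTTP to /nacos/v1/raft/vote on every other node.
Asynchronous responses: the requests are issued in parallel with CompletableFuture, and the tally runs once the responses have completed.
Majority rule: the node becomes Leader only when voteCount > N/2.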
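The majority rule can be isolated into a few lines. The sketch below is illustrative only: `VoteTally` is a hypothetical class, not part of Nacos, and it assumes the cluster size is known up front. It counts each voter at most once and reports a quorum as soon as `clusterSize / 2 + 1` votes are in, without waiting for the remaining responses.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not a Nacos class): tallies granted votes for one election round. */
final class VoteTally {
    private final int clusterSize;
    private final Set<String> granted = new HashSet<>();

    VoteTally(int clusterSize, String self) {
        if (clusterSize < 1) {
            throw new IllegalArgumentException("clusterSize must be >= 1");
        }
        this.clusterSize = clusterSize;
        granted.add(self); // a candidate always votes for itself
    }

    /** Records one response and returns true once a strict majority has been reached. */
    synchronized boolean record(String voter, boolean voteGranted) {
        if (voteGranted) {
            granted.add(voter); // set semantics: a duplicate response counts once
        }
        return hasQuorum();
    }

    synchronized boolean hasQuorum() {
        return granted.size() >= clusterSize / 2 + 1;
    }

    public static void main(String[] args) {
        VoteTally tally = new VoteTally(3, "10.0.0.1");
        System.out.println("quorum after own vote: " + tally.hasQuorum());
        System.out.println("quorum after one grant: " + tally.record("10.0.0.2", true));
    }
}
```

Using a set rather than a counter means a retried or duplicated response from the same node cannot inflate the tally.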

4. Handling a vote request: `CandidateHandler`

// com.alibaba.nacos.core.distributed.raft.processor.impl.CandidateHandler
public class CandidateHandler extends RequestHandler {

    @Override
    public HttpServerResponse handle(HttpRequest request, HttpServerResponse response) {
        String termStr = request.getParam("term");
        String candidateIP = request.getParam("candidate");
        long term = Long.parseLong(termStr);

        RaftPeer local = peers.get(NetUtils.localServer());
        
        // Grant the vote only if the candidate's term is strictly higher.
        // This simplified handler compares terms only; it does not compare logs.
        if (term > local.term.get()) {
            local.term.set(term);
            local.voteFor = candidateIP;
            local.state = RaftPeer.State.FOLLOWER;
            // Reset the election timer
            resetLeaderTimeout();
            return HttpResponseBuilder.create().entity("OK").build();
        }
        
        return HttpResponseBuilder.create().entity("DENY").build();
    }
}

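Voting conditions:
1. term > local.term (strictly greater, so a node cannot grant two votes in the same term)
2. The candidate's log is "at least as up-to-date" (Nacos simplifies this to the term comparison)
After voting, the node resets its election timeout so that it does not start another election right away.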
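For comparison, the voting rule as stated in the Raft paper also checks the candidate's last log entry and allows at most one vote per term. The following is a minimal sketch of that rule, not the Nacos handler: `VoteDecider` and its fields are hypothetical, and the local log is reduced to the term and index of its last entry.

```java
/** Hypothetical helper (not a Nacos class): the RequestVote rule from the Raft paper. */
final class VoteDecider {
    private long currentTerm;
    private String votedFor; // null means no vote cast in currentTerm
    private final long lastLogTerm;
    private final long lastLogIndex;

    VoteDecider(long currentTerm, long lastLogTerm, long lastLogIndex) {
        this.currentTerm = currentTerm;
        this.lastLogTerm = lastLogTerm;
        this.lastLogIndex = lastLogIndex;
    }

    /** Returns true if the vote is granted to the candidate. */
    boolean handleVoteRequest(String candidate, long term, long candLastLogTerm, long candLastLogIndex) {
        if (term < currentTerm) {
            return false; // stale candidate
        }
        if (term > currentTerm) {
            currentTerm = term; // newer term: adopt it and forget the old vote
            votedFor = null;
        }
        // "At least as up-to-date": higher last term wins; equal terms compare the index.
        boolean logOk = candLastLogTerm > lastLogTerm
                || (candLastLogTerm == lastLogTerm && candLastLogIndex >= lastLogIndex);
        boolean canVote = votedFor == null || votedFor.equals(candidate);
        if (logOk && canVote) {
            votedFor = candidate;
            return true;
        }
        return false;
    }

    long currentTerm() {
        return currentTerm;
    }

    public static void main(String[] args) {
        VoteDecider node = new VoteDecider(2, 2, 5);
        System.out.println("grant to shorter log: " + node.handleVoteRequest("a", 3, 2, 4));
        System.out.println("grant to equal log:   " + node.handleVoteRequest("b", 3, 2, 5));
    }
}
```

The log comparison is what prevents a node with a stale log from winning an election and overwriting committed entries.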

5. Becoming Leader: `becomeLeader()`

private void becomeLeader() {
    RaftPeer local = peers.get(NetUtils.localServer());
    local.state = RaftPeer.State.LEADER;
    local.leader = local.ip;
    
    // Start the heartbeat task
    GlobalExecutor.submitHeartBeat(new HeartBeatCore());
    
    Loggers.RAFT.info("Became the LEADER: {}", local.ip);
}

The node sets its state to LEADER.
It starts HeartBeatCore to send heartbeats periodically.

III. Log replication in the source

1. Write request entry point: `RaftController#write`

// com.alibaba.nacos.naming.controllers.RaftController
@PostMapping("/write")
public ResponseEntity<String> write(@RequestBody String value) {
    try {
        // Submit the log entry
        raftCore.submit(value);
        return ResponseEntity.ok("OK");
    } catch (Exception e) {
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
    }
}

Every write request eventually calls raftCore.submit().

2. Submitting a log entry: `RaftCore#submit`

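public synchronized boolean submit(String value) {
    // Only the Leader handles write requests
    if (!isLeader()) {
        // Forward to the Leader
        return forwardToLeader(value);
    }

    // Wrap the value in a log entry
    LogEntry logEntry = new LogEntry();
    logEntry.setTerm(peers.get(NetUtils.localServer()).term.get());
    // Assign the next log index; replicate() later reads it via getIndex()
    logEntry.setIndex(logStore.getLastIndex() + 1);
    logEntry.setData(value);
    logEntry.setType(LogEntry.Type.COMMAND);

    // Write to the local log
    logStore.write(logEntry);

    // Replicate the entry to the other nodes
    replicate(logEntry);

    return true;
}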

Non-Leader node: calls forwardToLeader() to forward the request.
Leader node: writes the local log, then calls replicate() to replicate the entry.

3. Log replication: `replicate()`

private void replicate(LogEntry entry) {
    for (String server : peers.allServers()) {
        if (server.equals(NetUtils.localServer())) {
            continue;
        }
        // Look up this node's nextIndex (named "next" so it does not shadow the nextIndex map)
        int next = getNextIndex(server);
        LogEntry prevLog = logStore.read(next - 1);

        // Build the AppendEntries request
        Map<String, Object> request = new HashMap<>();
        request.put("leaderId", NetUtils.localServer());
        request.put("term", peers.get(NetUtils.localServer()).term.get());
        request.put("prevLogIndex", next - 1);
        request.put("prevLogTerm", prevLog != null ? prevLog.getTerm() : 0);
        request.put("entries", Collections.singletonList(entry));
        request.put("leaderCommit", commitIndex);

        // Send the HTTP request
        String url = "http://" + server + ":8848/nacos/v1/raft/heartbeat";
        HttpAsyncClient.INSTANCE.post(url, request, new WriteCallback() {
            @Override
            public void onSuccess(String result) {
                // Update nextIndex and matchIndex
                matchIndex.put(server, entry.getIndex());
                nextIndex.put(server, entry.getIndex() + 1);
                
                // Check whether the entry can be committed
                checkCommit();
            }

            @Override
            public void onFail(Throwable e) {
                // Retry or back off nextIndex
                retryReplicate(server);
            }
        });
    }
}

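prevLogIndex / prevLogTerm: used for the log consistency check on the Follower.
Asynchronous callback: on success the Leader updates matchIndex and nextIndex; on failure it retries.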
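The bookkeeping behind nextIndex and matchIndex can be sketched on its own. This is a simplified illustration following the Raft paper, not Nacos code: `ReplicationProgress` and its method names are hypothetical. A rejected AppendEntries moves nextIndex back by one so the next attempt carries an earlier prevLogIndex; a success advances both indexes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not a Nacos class): the Leader's per-follower replication state. */
final class ReplicationProgress {
    private final Map<String, Long> nextIndex = new HashMap<>();
    private final Map<String, Long> matchIndex = new HashMap<>();

    /** On election, nextIndex starts just past the Leader's last entry and matchIndex at 0. */
    ReplicationProgress(Iterable<String> followers, long leaderLastIndex) {
        for (String follower : followers) {
            nextIndex.put(follower, leaderLastIndex + 1);
            matchIndex.put(follower, 0L);
        }
    }

    /** The follower accepted entries up to and including lastSentIndex. */
    void onSuccess(String follower, long lastSentIndex) {
        long match = Math.max(matchIndex.get(follower), lastSentIndex);
        matchIndex.put(follower, match);
        nextIndex.put(follower, match + 1);
    }

    /** The follower failed the prevLog check: retry one entry earlier, never below 1. */
    void onReject(String follower) {
        nextIndex.put(follower, Math.max(1L, nextIndex.get(follower) - 1));
    }

    long nextIndex(String follower) {
        return nextIndex.get(follower);
    }

    long matchIndex(String follower) {
        return matchIndex.get(follower);
    }

    public static void main(String[] args) {
        ReplicationProgress progress = new ReplicationProgress(java.util.Arrays.asList("b", "c"), 5);
        progress.onReject("b");
        System.out.println("nextIndex(b) after one reject = " + progress.nextIndex("b"));
    }
}
```

Stepping back one index per rejection is the simplest strategy; it trades extra round trips for simplicity when a follower is far behind.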

4. Receiving AppendEntries: `FollowerHandler`

// com.alibaba.nacos.core.distributed.raft.processor.impl.FollowerHandler
public class FollowerHandler extends RequestHandler {

    @Override
    public HttpServerResponse handle(HttpRequest request, HttpServerResponse response) {
        Map<String, Object> body = parseBody(request);
        // Parsed numbers may arrive as Integer or Long, so go through Number
        long prevLogIndex = ((Number) body.get("prevLogIndex")).longValue();
        long prevLogTerm = ((Number) body.get("prevLogTerm")).longValue();
        @SuppressWarnings("unchecked")
        List<LogEntry> entries = (List<LogEntry>) body.get("entries");

        RaftPeer local = peers.get(NetUtils.localServer());

        // Consistency check. prevLogIndex == 0 means the entries start at the beginning of the log.
        LogEntry prevLog = prevLogIndex > 0 ? logStore.read(prevLogIndex) : null;
        if (prevLogIndex > 0 && (prevLog == null || prevLog.getTerm() != prevLogTerm)) {
            return HttpResponseBuilder.create().status(400).entity("Log not matched").build();
        }

        // Append the new entries, overwriting conflicting ones
        for (LogEntry entry : entries) {
            LogEntry existing = logStore.read(entry.getIndex());
            if (existing != null && existing.getTerm() == entry.getTerm()) {
                continue; // already stored, nothing to do
            }
            if (existing != null) {
                // Conflict: drop this entry and everything after it
                logStore.deleteFrom(entry.getIndex());
            }
            logStore.write(entry);
        }

        // Update commitIndex
        long leaderCommit = ((Number) body.get("leaderCommit")).longValue();
        if (leaderCommit > commitIndex) {
            commitIndex = Math.min(leaderCommit, logStore.getLastIndex());
            // Apply committed entries to the state machine
            applyLogToStateMachine();
        }

        return HttpResponseBuilder.create().entity("OK").build();
    }
}

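Log mismatch handling: if prevLog does not match, the Follower rejects the request.
Overwrite mechanism: the Leader forces the Follower to overwrite conflicting entries.
commitIndex update: the Follower advances commitIndex and tries to apply the newly committed entries.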
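The Follower-side rules (reject on a prevLog mismatch, truncate on conflict, skip entries already stored) can be exercised with an in-memory log. This is a sketch under simplifying assumptions, not the Nacos storage layer: `FollowerLog` and `Entry` are hypothetical, the log lives in a list, and an entry's index is its 1-based position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory log (not a Nacos class) applying the AppendEntries rules. */
final class FollowerLog {
    /** One log entry; its index is its 1-based position in the log. */
    static final class Entry {
        final long term;
        final String data;

        Entry(long term, String data) {
            this.term = term;
            this.data = data;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    long lastIndex() {
        return entries.size();
    }

    /** Term of the entry at index; index 0 is the empty-log sentinel with term 0. */
    long termAt(long index) {
        return index == 0 ? 0 : entries.get((int) index - 1).term;
    }

    /** Returns false if the prevLog check fails, true once the entries are in place. */
    boolean append(long prevLogIndex, long prevLogTerm, List<Entry> newEntries) {
        if (prevLogIndex < 0 || prevLogIndex > lastIndex() || termAt(prevLogIndex) != prevLogTerm) {
            return false; // gap or term mismatch: the Leader must retry with an earlier index
        }
        long index = prevLogIndex;
        for (Entry entry : newEntries) {
            index++;
            if (index <= lastIndex()) {
                if (termAt(index) == entry.term) {
                    continue; // same index and term: already stored
                }
                // Conflict: remove this entry and everything after it
                entries.subList((int) index - 1, entries.size()).clear();
            }
            entries.add(entry);
        }
        return true;
    }

    public static void main(String[] args) {
        FollowerLog log = new FollowerLog();
        log.append(0, 0, java.util.Arrays.asList(new Entry(1, "a"), new Entry(1, "b")));
        System.out.println("lastIndex = " + log.lastIndex());
    }
}
```

Truncating only on a genuine term conflict keeps the handler idempotent, so a delayed or retried request does not delete entries that already match.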

5. Committing entries: `checkCommit()` and `applyLogToStateMachine()`

private void checkCommit() {
    List<Long> matchIndexList = new ArrayList<>(matchIndex.values());
    // matchIndex tracks followers only; the Leader's own log also counts toward the majority
    matchIndexList.add(logStore.getLastIndex());
    Collections.sort(matchIndexList);
    int N = matchIndexList.size();
    // Highest index stored on at least N/2 + 1 nodes
    long newCommitIndex = matchIndexList.get(N - (N / 2 + 1));

    if (newCommitIndex > commitIndex) {
        commitIndex = newCommitIndex;
        applyLogToStateMachine();
    }
}

private void applyLogToStateMachine() {
    while (lastApplied < commitIndex) {
        lastApplied++;
        LogEntry entry = logStore.read(lastApplied);
        // Notify the state machine processors (for example, configuration management)
        for (LogProcessor processor : logProcessors) {
            processor.onApply(entry);
        }
    }
}

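Majority confirmation: the committable index is the highest index that a majority of nodes have stored, found by sorting the match indexes.
Applying to the state machine: each committed entry triggers the LogProcessor callbacks (for example, to update Config data).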
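The majority calculation is easy to check in isolation. The sketch below is illustrative rather than Nacos code: `CommitIndexCalculator.majorityIndex` is a hypothetical name, and it assumes the followers' match indexes are tracked separately from the Leader's own last index, as in the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not a Nacos class): finds the highest index stored on a majority. */
final class CommitIndexCalculator {
    private CommitIndexCalculator() {
    }

    static long majorityIndex(long leaderLastIndex, Collection<Long> followerMatchIndexes) {
        List<Long> all = new ArrayList<>(followerMatchIndexes);
        all.add(leaderLastIndex); // the Leader always holds its own entries
        Collections.sort(all);
        int n = all.size();
        // After an ascending sort, the element at n - (n/2 + 1) and everything to its
        // right make up n/2 + 1 nodes, so that value is stored on a majority.
        return all.get(n - (n / 2 + 1));
    }

    public static void main(String[] args) {
        long commit = majorityIndex(5, Arrays.asList(3L, 2L));
        System.out.println("commit index = " + commit);
    }
}
```

The Raft paper adds one more condition that this sketch leaves out: a Leader only commits an index this way when the entry at that index belongs to its current term.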

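IV. Summary of key classes and call chains

1. Leader election call chain

GlobalExecutor → MasterElectionCore.run() 
    → RaftCore.requestVote() 
        → HTTP GET /nacos/v1/raft/vote 
            → CandidateHandler.handle() (voting logic on each peer)
        → majority of votes received
            → becomeLeader() → start heartbeats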

2. Log replication call chain

RaftController.write() 
    → RaftCore.submit() 
        → replicate() 
            → HTTP POST /nacos/v1/raft/heartbeat 
                → FollowerHandler.handle()
                    → append log entries + update commitIndex
                        → applyLogToStateMachine() → update configuration

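V. Source-level summary

Mechanism	Key implementation points
Leader election	Scheduled task + term increment + HTTP voting + majority rule
Log replication	Leader writes locally → broadcasts AppendEntries → Follower checks prevLog → appends entries → Leader commits
Consistency guarantees	Majority replication + log matching + strong Leader overwrite
State machine application	`LogProcessor.onApply()` callback updates configuration data
Persistence	`RaftStore` (the `logStore` field in the snippets) writes to disk and supports snapshots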