Key Points of Kafka's Replication Mechanism, Explained in 2 Minutes

大博士.J

Published 2025-03-14 14:43:28

Tags: kafka

Article link: https://blog.youkuaiyun.com/qq_24396737/article/details/146256865

Apache Kafka relies on replica synchronization to provide high availability and durability. The mechanism rests on the following core concepts:

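Replication

Every Kafka partition has multiple replicas, which play one of two roles:

Leader replica: handles all produce requests and, by default, all consumer fetch requests for the partition.
Follower replica: replicates data from the Leader and, by default, does not serve client requests (since Kafka 2.4, consumers can optionally be configured to fetch from Followers).

The number of replicas is the topic's replication factor, chosen when the topic is created (for example with --replication-factor 3 in kafka-topics.sh). On the broker, default.replication.factor supplies the value for automatically created topics. For example:

default.replication.factor=3  # 3 replicas per partition for auto-created topics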

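Replica Synchronization

Kafka tracks replica health with the in-sync replica set (ISR):

ISR (In-Sync Replicas): the Leader plus the Followers that are keeping up with it.
OSR (Out-of-Sync Replicas): Followers that have fallen too far behind and are no longer keeping up.
AR (Assigned Replicas): every replica of the partition, Leader and Followers alike, so AR = ISR + OSR.

A Follower stays in the ISR as long as it has caught up to the end of the Leader's log within the last replica.lag.time.max.ms. A Follower that lags for longer is removed, and it rejoins the ISR once it catches up again.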

Example

Suppose Partition-0 starts in the following state:

Leader: Broker 1
Follower: Broker 2 (ISR)
Follower: Broker 3 (ISR)

If Broker 2 falls too far behind, it is removed from the ISR:

Leader: Broker 1
Follower: Broker 3 (ISR)
Follower: Broker 2 (OSR, lagging)
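
The membership rule can be sketched as a small, self-contained model. This is only an illustration, not Kafka's broker code: the class `IsrSketch`, the method `inSyncReplicas`, and the idea of passing in each Follower's "last caught up" timestamp are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of ISR membership; not Kafka's actual implementation. */
final class IsrSketch {
    /**
     * Returns the broker ids considered in sync: the leader, plus every follower
     * whose last "caught up" timestamp is within maxLagMs of nowMs.
     */
    static List<Integer> inSyncReplicas(int leaderId,
                                        Map<Integer, Long> followerLastCaughtUpMs,
                                        long nowMs,
                                        long maxLagMs) {
        List<Integer> isr = new ArrayList<>();
        isr.add(leaderId); // the leader is always a member of its own ISR
        for (Map.Entry<Integer, Long> e : followerLastCaughtUpMs.entrySet()) {
            if (nowMs - e.getValue() <= maxLagMs) {
                isr.add(e.getKey());
            }
        }
        return isr;
    }

    public static void main(String[] args) {
        Map<Integer, Long> lastCaughtUp = new LinkedHashMap<>();
        lastCaughtUp.put(2, 5_000L);  // Broker 2 last caught up at t = 5 s
        lastCaughtUp.put(3, 19_000L); // Broker 3 last caught up at t = 19 s
        // At t = 20 s with a 10 s limit, Broker 2 has lagged for 15 s and drops out.
        System.out.println(inSyncReplicas(1, lastCaughtUp, 20_000L, 10_000L)); // prints [1, 3]
    }
}
```

The 10 s limit plays the role of replica.lag.time.max.ms; any Follower missing from the returned list would be in the OSR.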

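Data Synchronization Flow

Kafka uses a Leader-Follower replication model in which Followers pull data from the Leader:

The Producer sends messages to the Leader.
The Leader appends the data to its local log.
Followers continuously send fetch requests to the Leader to pull new data.
Each Follower appends the fetched data to its local log. There is no separate ACK message: the offset in the Follower's next fetch request tells the Leader how far that Follower has replicated.
Once every replica in the ISR has replicated a message, the Leader advances the high watermark (HW). The message is then committed and becomes visible to consumers.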
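
The commit step can be illustrated with a minimal sketch, assuming the usual description of the high watermark as the smallest log end offset (LEO) among the ISR members. The class and method names here are hypothetical and exist only for the example.

```java
import java.util.Collection;
import java.util.List;

/** Toy model: the high watermark is the smallest log end offset (LEO) in the ISR. */
final class HighWatermarkSketch {
    static long highWatermark(Collection<Long> isrLogEndOffsets) {
        if (isrLogEndOffsets.isEmpty()) {
            throw new IllegalArgumentException("ISR must contain at least the leader");
        }
        long hw = Long.MAX_VALUE;
        for (long leo : isrLogEndOffsets) {
            hw = Math.min(hw, leo);
        }
        return hw;
    }

    public static void main(String[] args) {
        // The leader has written offsets 0..9 (LEO 10); two followers have LEO 8 and 10.
        // Offsets below the result are replicated on every ISR member, i.e. committed.
        System.out.println(highWatermark(List.of(10L, 8L, 10L))); // prints 8
    }
}
```

In this model, offsets 8 and 9 exist on the Leader but stay invisible to consumers until the slower Follower fetches them.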

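Minimum In-Sync Replicas (min.insync.replicas)

To strengthen durability, set min.insync.replicas. For producers that use acks=all, the Leader accepts a write only while the ISR, which includes the Leader itself, has at least N members:

min.insync.replicas=2

If the ISR shrinks below min.insync.replicas, the Leader rejects acks=all writes with NotEnoughReplicasException, so a message is never acknowledged while it exists on too few replicas.

Example

replication.factor=3
min.insync.replicas=2

Case 1

Both Followers are in sync (ISR size 3):
- Producer -> Leader -> Follower1, Follower2 (write succeeds)

Case 2

One Follower drops out of the ISR (ISR size 2: the Leader plus Follower1):
- Producer -> Leader -> Follower1 (ISR) (write still succeeds, because 2 >= min.insync.replicas)

Case 3

Both Followers drop out of the ISR (ISR size 1: the Leader alone):
- Producer -> Leader (write rejected, because 1 < min.insync.replicas)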
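
The admission check behind these cases can be sketched as follows. This is a simplified model rather than broker code: `MinIsrSketch`, `admit`, and the `Result` enum are hypothetical names, and the ISR size is assumed to count the Leader, as Kafka does.

```java
/** Toy model of the leader's check for a produce request; names are illustrative. */
final class MinIsrSketch {
    enum Result { ACCEPTED, NOT_ENOUGH_REPLICAS }

    /**
     * @param acksAll           whether the producer uses acks=all
     * @param isrSize           current ISR size, counting the leader
     * @param minInsyncReplicas the topic's min.insync.replicas
     */
    static Result admit(boolean acksAll, int isrSize, int minInsyncReplicas) {
        // The minimum is enforced only for acks=all; acks=0 and acks=1 are not gated by it.
        if (acksAll && isrSize < minInsyncReplicas) {
            return Result.NOT_ENOUGH_REPLICAS;
        }
        return Result.ACCEPTED;
    }

    public static void main(String[] args) {
        System.out.println(admit(true, 3, 2));  // prints ACCEPTED
        System.out.println(admit(true, 2, 2));  // prints ACCEPTED
        System.out.println(admit(true, 1, 2));  // prints NOT_ENOUGH_REPLICAS
        System.out.println(admit(false, 1, 2)); // prints ACCEPTED
    }
}
```

The last call shows why min.insync.replicas protects nothing unless producers also set acks=all.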

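Leader Election

When the Leader fails, Kafka must elect a new one:

It prefers a replica from the ISR, which guarantees that no committed data is lost.
If no ISR replica is available (only OSR replicas remain), whether an OSR replica may be elected depends on unclean.leader.election.enable:
- false (default): no OSR replica is elected. This prevents data loss, but the partition stays unavailable until an ISR replica comes back.
- true: an OSR replica may be elected. This restores availability, but committed messages can be lost.

Example configuration

unclean.leader.election.enable=false  # never let an out-of-sync replica become Leader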
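
A rough sketch of this decision is shown below. It assumes the commonly described rule of picking the first live replica in assignment order that is in the ISR, and falling back to any live replica only when unclean election is enabled. `LeaderElectionSketch` and `elect` are hypothetical names, and the real controller logic has more cases than this.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Toy model of partition leader election; not the controller's actual code. */
final class LeaderElectionSketch {
    static Optional<Integer> elect(List<Integer> assignedReplicas,
                                   Set<Integer> isr,
                                   Set<Integer> liveBrokers,
                                   boolean uncleanElectionEnabled) {
        // Clean election: first assigned replica that is alive and in the ISR.
        for (int id : assignedReplicas) {
            if (liveBrokers.contains(id) && isr.contains(id)) {
                return Optional.of(id);
            }
        }
        // Unclean election: any live replica, at the risk of losing committed data.
        if (uncleanElectionEnabled) {
            for (int id : assignedReplicas) {
                if (liveBrokers.contains(id)) {
                    return Optional.of(id);
                }
            }
        }
        return Optional.empty(); // partition stays offline
    }

    public static void main(String[] args) {
        List<Integer> ar = List.of(1, 2, 3);
        // Broker 1 (the old leader) is down; Broker 3 is still in the ISR.
        System.out.println(elect(ar, Set.of(1, 3), Set.of(2, 3), false)); // prints Optional[3]
        // Only the dead leader was in the ISR.
        System.out.println(elect(ar, Set.of(1), Set.of(2, 3), false));    // prints Optional.empty
        System.out.println(elect(ar, Set.of(1), Set.of(2, 3), true));     // prints Optional[2]
    }
}
```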

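Replica Lag

Kafka tolerates some replication delay, but a Follower that has not caught up with the Leader for longer than replica.lag.time.max.ms is removed from the ISR:

replica.lag.time.max.ms=10000  # removed from the ISR after more than 10 s without catching up (the default is 30000 since Kafka 2.5)

ISR too small → hurts availability (a Leader failure may leave no eligible successor, and acks=all writes fail once the ISR drops below min.insync.replicas).
ISR that keeps slow Followers → hurts performance (every acks=all write waits for the slowest ISR member).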

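Transactions (Exactly-Once Semantics)

Kafka 0.11+ supports transactions, which provide exactly-once semantics (EOS):

The producer uses a transactional ID (transactional.id) to write to multiple partitions atomically within one transaction.
Consumers set isolation.level=read_committed so that they read only messages from committed transactions.

Producer transaction example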

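// props must also contain bootstrap.servers and the key/value serializers.
// Exception types come from org.apache.kafka.common and org.apache.kafka.common.errors.
props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "tx-id-123"); // transactional ID
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.initTransactions();

try {
    producer.beginTransaction();
    producer.send(new ProducerRecord<>("topic", "key", "value"));
    producer.commitTransaction(); // commit the transaction
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // Fatal errors: the transaction cannot be aborted, so just close the producer
    producer.close();
} catch (KafkaException e) {
    // Any other error: abort the transaction so it can be retried
    producer.abortTransaction();
}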

Consumer side (with transactions)

props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // read only data from committed transactions
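
What read_committed means for a reader can be sketched with a toy log. The model below is an assumption-laden simplification (the classes `ReadCommittedSketch` and `Entry` and the `TxnState` enum are invented for the example): aborted records are skipped, and reading stops at the first record of a still-open transaction, which stands in for the last stable offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of what a read_committed consumer sees; not the Kafka client. */
final class ReadCommittedSketch {
    enum TxnState { OPEN, COMMITTED, ABORTED }

    /** A log entry; txnId is null for non-transactional records. */
    static final class Entry {
        final long offset;
        final String txnId;
        final String value;

        Entry(long offset, String txnId, String value) {
            this.offset = offset;
            this.txnId = txnId;
            this.value = value;
        }
    }

    static List<String> visible(List<Entry> log, Map<String, TxnState> txnStates) {
        List<String> out = new ArrayList<>();
        for (Entry e : log) {
            if (e.txnId == null) {
                out.add(e.value); // non-transactional records are always readable
                continue;
            }
            // Unknown transactions are treated as still open in this sketch.
            TxnState state = txnStates.getOrDefault(e.txnId, TxnState.OPEN);
            if (state == TxnState.OPEN) {
                break; // nothing at or beyond an open transaction is returned yet
            }
            if (state == TxnState.COMMITTED) {
                out.add(e.value);
            }
            // ABORTED records are skipped.
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> log = List.of(
                new Entry(0, null, "a"),
                new Entry(1, "tx1", "b"),
                new Entry(2, "tx2", "c"),
                new Entry(3, "tx3", "d"),
                new Entry(4, null, "e"));
        Map<String, TxnState> states = Map.of(
                "tx1", TxnState.COMMITTED,
                "tx2", TxnState.ABORTED,
                "tx3", TxnState.OPEN);
        System.out.println(visible(log, states)); // prints [a, b]
    }
}
```

Record "e" is held back even though it is non-transactional, because it sits behind the open transaction tx3.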

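Complete Configuration Example

Producer

Key settings

acks=all: the Leader acknowledges a write only after every replica in the ISR has it.
retries=Integer.MAX_VALUE: retry on transient failures instead of dropping the message (retrying stops once delivery.timeout.ms expires).
enable.idempotence=true: enable idempotence so that retries do not produce duplicate messages.
max.in.flight.requests.per.connection=1: send one request at a time so that retries cannot reorder messages. With idempotence enabled, ordering is also preserved for values up to 5, so 1 is a conservative choice that costs throughput.

Configuration file (`producer.properties`)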

acks=all
retries=2147483647
enable.idempotence=true
max.in.flight.requests.per.connection=1
request.timeout.ms=60000
delivery.timeout.ms=120000
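
These settings interact: an idempotent producer requires acks=all, retries greater than 0, and at most 5 in-flight requests per connection. The sketch below checks a `Properties` object against those three constraints. It is a stand-alone illustration, not the client's own validation: `IdempotenceConfigCheck` and `violations` are hypothetical names, and the fallback values used for missing keys are assumptions based on recent client defaults.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative pre-flight check for idempotent-producer settings. */
final class IdempotenceConfigCheck {
    static List<String> violations(Properties p) {
        List<String> problems = new ArrayList<>();
        if (!Boolean.parseBoolean(p.getProperty("enable.idempotence", "false"))) {
            return problems; // nothing to check when idempotence is off
        }
        String acks = p.getProperty("acks", "all"); // assumed default
        if (!acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all");
        }
        if (Integer.parseInt(p.getProperty("retries", "2147483647")) <= 0) {
            problems.add("retries must be greater than 0");
        }
        if (Integer.parseInt(p.getProperty("max.in.flight.requests.per.connection", "5")) > 5) {
            problems.add("max.in.flight.requests.per.connection must be at most 5");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("acks", "all");
        p.setProperty("retries", "2147483647");
        p.setProperty("enable.idempotence", "true");
        p.setProperty("max.in.flight.requests.per.connection", "1");
        System.out.println(violations(p)); // prints []

        p.setProperty("acks", "1");
        System.out.println(violations(p)); // prints [acks must be all]
    }
}
```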

Example code: a reliable Kafka producer

import org.apache.kafka.clients.producer.*;

import java.util.Properties;

public class ReliableProducer {
    public static void main(String[] args) {
        // Configure the producer
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for every ISR replica
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE); // retry until delivery.timeout.ms expires
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"); // enable idempotence
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1"); // avoid reordering
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");

        // try-with-resources closes the producer; close() flushes buffered records first
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Send messages
            for (int i = 0; i < 10; i++) {
                ProducerRecord<String, String> record = new ProducerRecord<>("reliable_topic", "key" + i, "value" + i);

                // The callback runs once the broker acknowledges or the send finally fails
                producer.send(record, (metadata, exception) -> {
                    if (exception == null) {
                        System.out.println("Message sent to partition " + metadata.partition() + " at offset " + metadata.offset());
                    } else {
                        System.err.println("Message send failed: " + exception.getMessage());
                    }
                });
            }
        }
    }
}

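Broker (server)

Key settings

default.replication.factor=3: keep 3 replicas of each partition so that a single broker failure does not lose data (this is the broker default for automatically created topics; an explicitly created topic sets its own replication factor).
min.insync.replicas=2: an acks=all write needs at least 2 in-sync replicas, counting the Leader.
log.flush.interval.messages=1 and log.flush.interval.ms=1000: force the log to be flushed to disk after every message and at least once per second. This narrows the window for loss when several brokers crash at once, but it costs considerable throughput. By default Kafka leaves flushing to the operating system and relies on replication for durability.
unclean.leader.election.enable=false: never elect a non-ISR replica as Leader, which avoids data loss.

Configuration file (`server.properties`)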

default.replication.factor=3
min.insync.replicas=2
log.flush.interval.messages=1
log.flush.interval.ms=1000
unclean.leader.election.enable=false

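Consumer

Key settings

enable.auto.commit=false: turn off automatic offset commits and commit manually, only after processing succeeds, so that a crash does not skip unprocessed messages.
auto.offset.reset=earliest: when the group has no committed offset (or the committed offset is no longer valid), start from the earliest available message instead of jumping to the latest.

Configuration file (`consumer.properties`)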

enable.auto.commit=false
auto.offset.reset=earliest

Example code: a reliable Kafka consumer

import org.apache.kafka.clients.consumer.*;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ReliableConsumer {
    public static void main(String[] args) {
        // Configure the consumer
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "reliable_group");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // disable automatic offset commits
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // start from the earliest offset when none is committed
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("reliable_topic"));

        // Poll for messages and commit offsets manually
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // Process the record
                    System.out.println("Received message: " + record.value() + ", offset: " + record.offset());
                }

                // Commit only after the whole batch has been processed: commitSync() with no
                // arguments commits the offsets of everything returned by the last poll(),
                // so calling it inside the loop would mark unprocessed records as consumed.
                if (!records.isEmpty()) {
                    consumer.commitSync();
                }
            }
        } finally {
            consumer.close();
        }
    }
}
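
The effect of committing only after processing can be seen in a small simulation. This is a toy model that does not use the Kafka client: `AtLeastOnceSketch`, `run`, and the `crashAfter` parameter are invented for the example, and the log is a plain list indexed by offset.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy simulation of "process, then commit" with one crash; not the Kafka client. */
final class AtLeastOnceSketch {
    /**
     * Reads the log in batches of batchSize and commits after each fully processed batch.
     * The consumer crashes once, right after it has processed crashAfter records in total
     * (before committing the current batch), then restarts from the last committed offset.
     * Returns every record processed, in order, including reprocessed ones.
     */
    static List<String> run(List<String> log, int batchSize, int crashAfter) {
        List<String> processed = new ArrayList<>();
        int committed = 0;       // next offset to read after a restart
        boolean crashed = false; // the crash happens at most once
        while (committed < log.size()) {
            int end = Math.min(committed + batchSize, log.size());
            boolean crashNow = false;
            for (int offset = committed; offset < end; offset++) {
                processed.add(log.get(offset));
                if (!crashed && processed.size() == crashAfter) {
                    crashed = true;
                    crashNow = true;
                    break; // crash: this batch's offsets are never committed
                }
            }
            if (!crashNow) {
                committed = end; // the equivalent of commitSync() after the batch
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Crash after the third record, in the middle of the second batch.
        System.out.println(run(List.of("m0", "m1", "m2", "m3"), 2, 3)); // prints [m0, m1, m2, m2, m3]
    }
}
```

No record is lost, but m2 is processed twice: manual commits after processing give at-least-once delivery, so the processing logic should tolerate duplicates.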

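Summary

Mechanism	Purpose
Replication	Maintains multiple replicas of each partition, as set by the replication factor, for fault tolerance
In-sync replica set (ISR)	Tracks the replicas that are keeping up with the Leader, ensuring data consistency
Follower pull	Followers use a pull model, which avoids overloading the Leader
Minimum in-sync replicas (min.insync.replicas)	Sets how many ISR replicas (Leader included) must be in sync before the Leader accepts an acks=all write
Leader election	On failure, prefers an ISR replica as the new Leader
Replica lag detection	A Follower that lags beyond `replica.lag.time.max.ms` is removed from the ISR
Transactions	`transactional.id` enables exactly-once semantics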