Kafka Cluster Core Concepts and Architecture
1 ) Overview
Kafka is distributed by design; even a single node forms a one-node cluster. Cluster deployments rely on ZooKeeper for coordination (broker registration, leader election, and so on).
Since Kafka 1.x, some data (such as offsets and partition metadata) has moved from ZooKeeper into Kafka itself, but cluster coordination is still ZooKeeper's job.
2 ) Cluster Fundamentals
Kafka supports cluster mode by design; a single node is simply a one-node cluster. Production deployments need multiple nodes for high availability, while demo environments can use a pseudo-cluster (multiple processes on one machine simulating a distributed setup).
3 ) Evolution of the ZooKeeper Dependency
Early versions (0.x) depended heavily on ZooKeeper for metadata such as offsets and partition info. From 1.x onward this data gradually moved into Kafka itself, though cluster coordination still relies on ZooKeeper. Current 3.x releases are phasing ZooKeeper out entirely via KIP-500 (KRaft mode).
4 ) Key Cluster Elements:
- Broker identity: each node is uniquely identified by broker.id (an integer), e.g. 0, 1, 2.
- Node membership: all brokers that connect to the same ZooKeeper ensemble form a single Kafka cluster.
- High-availability architecture:
- Partition data is sharded across different brokers
- Replication provides data redundancy
Evolution note: Kafka 0.x depended heavily on ZooKeeper; decoupling began after 1.x, but 2.x still relies on it for cluster coordination.
Pseudo-Cluster Deployment Walkthrough
Pseudo-cluster characteristics: multiple broker processes on one machine simulate multiple nodes; the following settings must be kept distinct per node:
# Node 1 config (server-1.properties)
broker.id=1
listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs-1
zookeeper.connect=localhost:2181   # shared by all three nodes

# Node 2 config (server-2.properties)
broker.id=2
listeners=PLAINTEXT://:9093        # different port
log.dirs=/tmp/kafka-logs-2         # different log directory

# Node 3 config (server-3.properties)
broker.id=3
listeners=PLAINTEXT://:9094
log.dirs=/tmp/kafka-logs-3
Key Points for Pseudo-Clusters
- Each node needs its own port (9092/9093/9094)
- Each node needs its own log directory (avoids I/O conflicts)
- All nodes share one ZooKeeper ensemble for state coordination
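Since the three config files differ only in three values, they can be generated with a small script. A minimal sketch, assuming the same ports and /tmp paths as above and a shared ZooKeeper on localhost:2181:

```shell
#!/bin/sh
# Generate the three pseudo-cluster config files in the current directory.
for i in 1 2 3; do
  port=$((9091 + i))   # 9092, 9093, 9094
  cat > "server-$i.properties" <<EOF
broker.id=$i
listeners=PLAINTEXT://:$port
log.dirs=/tmp/kafka-logs-$i
zookeeper.connect=localhost:2181
EOF
done
# Show what was generated
grep -H 'broker.id' server-*.properties
```

Run it once from the Kafka config directory, then start each broker against its generated file.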
Startup commands:
# Start each of the three nodes (use separate terminals, or add -daemon)
kafka-server-start.sh config/server-1.properties
kafka-server-start.sh config/server-2.properties
kafka-server-start.sh config/server-3.properties
Verify the cluster:
# List broker API versions (confirms brokers are reachable)
kafka-broker-api-versions.sh --bootstrap-server localhost:9092
# Verify the broker processes
ps aux | grep kafka-server   # should show three processes
Production note: a real cluster spans separate physical machines to avoid a single point of failure; a pseudo-cluster is for development and testing only.
Replica Mechanism in Depth
1 ) Core Purpose
Log replication provides:
- High data availability (automatic recovery from failures)
- Load distribution (partition leaders spread across brokers)
- Partition fault tolerance
In short: redundancy buys availability and failure recovery.
2 ) How It Works:
- Replication factor:
- The number of replicas per partition (3 or more recommended)
- Specified at topic creation:
kafka-topics.sh --create --topic orders \
  --partitions 3 --replication-factor 3 \
  --bootstrap-server localhost:9092
- Replica placement:
- The leader replica handles reads and writes
- Follower replicas fetch data asynchronously from the leader
- Partitions are distributed evenly across the cluster's brokers
3 ) Failure Recovery Flow:
When a broker fails, the controller notices (via its ZooKeeper watch), elects a new leader for each affected partition from that partition's ISR, and pushes the updated metadata to the remaining brokers and clients.
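The leader-election step can be sketched as a pure function. This is an illustration of the preferred-replica rule, not kafkajs or broker code:

```typescript
// On leader failure the controller picks the first replica in the partition's
// assignment list that is still a member of the ISR.
function electLeader(replicas: number[], isr: Set<number>): number | null {
  for (const candidate of replicas) {
    if (isr.has(candidate)) return candidate;
  }
  // No in-sync replica left: the partition stays offline
  // (unless unclean.leader.election.enable=true, which risks data loss).
  return null;
}

// Partition assigned to brokers [1, 2, 3]; broker 1 (the leader) just died:
console.log(electLeader([1, 2, 3], new Set([2, 3]))); // → 2
```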
4 ) Replica Placement Rules:
- Partition leaders are spread evenly across brokers
- The replication factor determines the number of replicas
- The ISR (In-Sync Replicas) set tracks which replicas are caught up
5 ) Topology Example:
Broker1 [P0-Leader, P1-Follower, P2-Follower]
Broker2 [P0-Follower, P1-Leader, P2-Follower]
Broker3 [P0-Follower, P1-Follower, P2-Leader]
6 ) Replica Configuration in Practice:
# Create a topic with replication factor 3 (the key parameter)
./kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 3 \
  --partitions 3 \
  --topic orders
7 ) Verifying Replica State:
./kafka-topics.sh --describe --topic orders --bootstrap-server localhost:9092
Key output fields:
- Partition: partition number
- Leader: broker ID of the current leader
- Replicas: broker IDs of all replicas
- Isr: broker IDs currently in sync
Cluster Configuration Verification and Monitoring
View topic metadata:
kafka-topics.sh --describe --topic orders \
--bootstrap-server localhost:9092
Example output:
Topic: orders Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: orders Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
Topic: orders Partition: 2 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Key fields:
- Leader: the broker currently serving reads and writes
- Replicas: brokers holding a copy of the partition
- ISR (In-Sync Replicas): the set of replicas that are caught up
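The rotation visible in the Replicas column can be reproduced with a simplified sketch of round-robin placement (illustrative only; Kafka's real algorithm also randomizes the starting broker):

```typescript
// Partition p's leader is broker (p % n); its followers are the next
// rf - 1 brokers in ring order, which yields the rotated lists above.
function assignReplicas(brokerIds: number[], partitions: number, rf: number): number[][] {
  const n = brokerIds.length;
  return Array.from({ length: partitions }, (_, p) =>
    Array.from({ length: rf }, (_, r) => brokerIds[(p + r) % n]),
  );
}

console.log(assignReplicas([1, 2, 3], 3, 3));
// → [ [ 1, 2, 3 ], [ 2, 3, 1 ], [ 3, 1, 2 ] ]  (first entry = leader)
```

Each broker leads exactly one partition and follows the other two, matching the --describe output above.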
Engineering Example 1
1 ) Approach 1: Basic Producer-Consumer
// producer.service.ts
import { Injectable } from '@nestjs/common';
import { Kafka, Producer, ProducerRecord } from 'kafkajs';

@Injectable()
export class KafkaProducer {
  private producer: Producer;

  constructor() {
    const kafka = new Kafka({
      brokers: ['localhost:9092', 'localhost:9093', 'localhost:9094'],
      clientId: 'order-service',
    });
    this.producer = kafka.producer();
  }

  async connect() {
    await this.producer.connect();
  }

  async sendOrderEvent(orderData: object) {
    const record: ProducerRecord = {
      topic: 'orders',
      messages: [
        { key: 'order_created', value: JSON.stringify(orderData) },
      ],
      acks: -1, // wait for all in-sync replicas to acknowledge
    };
    return this.producer.send(record);
  }
}
// consumer.service.ts
import { Injectable, OnModuleInit } from '@nestjs/common';
import { Kafka, Consumer, EachMessagePayload } from 'kafkajs';

@Injectable()
export class KafkaConsumer implements OnModuleInit {
  private consumer: Consumer;

  async onModuleInit() {
    const kafka = new Kafka({
      clientId: 'payment-service',
      brokers: ['localhost:9092', 'localhost:9093', 'localhost:9094'],
    });
    // groupId is a consumer option, not a client option
    this.consumer = kafka.consumer({ groupId: 'payment-group' });
    await this.consumer.connect();
    await this.consumer.subscribe({ topic: 'orders', fromBeginning: true });
    await this.consumer.run({
      eachMessage: async ({ topic, partition, message }: EachMessagePayload) => {
        console.log(`[Payment] Received: ${message.value?.toString()}`);
        // business logic goes here
      },
    });
  }
}
2 ) Approach 2: Transactional Messaging
// transactional.producer.ts
import { Injectable } from '@nestjs/common';
import { Kafka, Producer } from 'kafkajs';

// Placeholder for the application's persistence layer
declare const database: { updateInventory(items: unknown[]): Promise<void> };

@Injectable()
export class TransactionalProducer {
  private producer: Producer;

  constructor() {
    const kafka = new Kafka({
      brokers: [/*...*/],
    });
    // transactionTimeout is a producer option in kafkajs, not a client option
    this.producer = kafka.producer({
      transactionalId: 'order-transactional-producer',
      transactionTimeout: 30000,
      maxInFlightRequests: 1,
      idempotent: true,
    });
  }

  async executeOrderTransaction(orderData: { items: unknown[] }) {
    const transaction = await this.producer.transaction();
    try {
      // 1. Send the order event
      await transaction.send({
        topic: 'orders',
        messages: [{ value: JSON.stringify(orderData) }],
      });
      // 2. Database operation. Note: the Kafka transaction does not make this
      //    call atomic with the send; for true atomicity use an outbox pattern.
      await database.updateInventory(orderData.items);
      // 3. Commit the Kafka transaction
      await transaction.commit();
    } catch (error) {
      await transaction.abort();
      throw error;
    }
  }
}
3 ) Approach 3: Schema Registry and Avro Serialization
// avro.producer.ts
import { Injectable } from '@nestjs/common';
import { Kafka, Producer } from 'kafkajs';
import { SchemaRegistry, SchemaType } from '@kafkajs/confluent-schema-registry';

@Injectable()
export class AvroProducer {
  private producer: Producer;
  private registry: SchemaRegistry;

  constructor() {
    this.registry = new SchemaRegistry({ host: 'http://schema-registry:8081' });
    const kafka = new Kafka({ brokers: [/*...*/] });
    this.producer = kafka.producer();
  }

  async registerOrderSchema() {
    return this.registry.register({
      type: SchemaType.AVRO,
      schema: JSON.stringify({
        type: 'record',
        name: 'Order',
        fields: [
          { name: 'id', type: 'int' },
          { name: 'product', type: 'string' },
          { name: 'quantity', type: 'int' },
        ],
      }),
    });
  }

  async sendAvroMessage(schemaId: number, orderData: object) {
    const encodedValue = await this.registry.encode(schemaId, orderData);
    await this.producer.send({
      topic: 'avro-orders',
      messages: [{ value: encodedValue }],
    });
  }
}
Engineering Example 2
1 ) Approach 1: Basic Producer-Consumer Model
// producer.service.ts
import { Inject, Injectable } from '@nestjs/common';
import { ClientKafka } from '@nestjs/microservices';

@Injectable()
export class OrderProducer {
  // 'KAFKA_SERVICE' is whatever injection token you registered via ClientsModule
  constructor(@Inject('KAFKA_SERVICE') private readonly client: ClientKafka) {}

  async sendOrderEvent(orderId: string) {
    this.client.emit('order_created', {
      id: orderId,
      timestamp: Date.now(),
    });
  }
}

// consumer.service.ts
import { Injectable } from '@nestjs/common';
import { MessagePattern, Payload } from '@nestjs/microservices';

@Injectable()
export class OrderConsumer {
  @MessagePattern('order_created')
  handleOrder(@Payload() message: any) {
    console.log(`Processed order: ${message.id}`);
  }
}
// kafka.config.ts
import { KafkaOptions, Transport } from '@nestjs/microservices';

export const kafkaConfig: KafkaOptions = {
  transport: Transport.KAFKA,
  options: {
    client: {
      brokers: ['localhost:9092', 'localhost:9093', 'localhost:9094'],
    },
    consumer: {
      groupId: 'order-service',
    },
  },
};
2 ) Approach 2: Transactional Message Handling
// transactional.producer.ts
import { Injectable } from '@nestjs/common';
import { Kafka } from 'kafkajs';

@Injectable()
export class TransactionProducer {
  private kafka = new Kafka({
    brokers: (process.env.KAFKA_BROKERS ?? 'localhost:9092').split(','),
  });

  async sendPaymentEvent(paymentData: object) {
    // Transactions require a transactionalId and an idempotent producer
    const producer = this.kafka.producer({
      transactionalId: 'payment-producer',
      transactionTimeout: 30000,
      maxInFlightRequests: 1,
      idempotent: true,
    });
    await producer.connect();
    const transaction = await producer.transaction();
    try {
      await transaction.send({
        topic: 'payments',
        messages: [{ value: JSON.stringify(paymentData) }],
      });
      await transaction.commit();
    } catch (error) {
      await transaction.abort();
      throw new Error('Transaction failed');
    } finally {
      await producer.disconnect();
    }
  }
}
3 ) Approach 3: Custom Partition Assignment
// partition-aware.consumer.ts
import { Injectable } from '@nestjs/common';
import { AssignerProtocol, Kafka } from 'kafkajs';

// Sketch of kafkajs's custom assigner interface. The policy below (assign
// every partition to the first group member) is purely illustrative.
const FirstMemberAssigner = ({ cluster }: { cluster: any }) => ({
  name: 'FirstMemberAssigner',
  version: 1,
  async assign({ members, topics }: { members: { memberId: string }[]; topics: string[] }) {
    const assignment: { [topic: string]: number[] } = {};
    for (const topic of topics) {
      const partitionMetadata = cluster.findTopicPartitionMetadata(topic);
      assignment[topic] = partitionMetadata.map((m: { partitionId: number }) => m.partitionId);
    }
    return [
      {
        memberId: members[0].memberId,
        memberAssignment: AssignerProtocol.MemberAssignment.encode({
          version: 1,
          assignment,
          userData: Buffer.alloc(0),
        }),
      },
    ];
  },
  protocol({ topics }: { topics: string[] }) {
    return {
      name: 'FirstMemberAssigner',
      metadata: AssignerProtocol.MemberMetadata.encode({
        version: 1,
        topics,
        userData: Buffer.alloc(0),
      }),
    };
  },
});

@Injectable()
export class PartitionConsumer {
  private kafka = new Kafka({
    brokers: ['kafka1:9092', 'kafka2:9093'],
  });

  async startConsumer() {
    const consumer = this.kafka.consumer({
      groupId: 'partition-group',
      partitionAssigners: [FirstMemberAssigner],
    });
    await consumer.connect();
    await consumer.subscribe({ topic: 'sensor_data' });
    await consumer.run({
      eachMessage: async ({ partition, message }) => {
        console.log(`Partition ${partition}: ${message.value?.toString()}`);
      },
    });
  }
}
Kafka Configuration and Tuning
1 ) Kafka connection configuration (NestJS):
Note: kafkajs-based clients (which NestJS's Kafka transport uses) connect only to the brokers, never to ZooKeeper, so there is no client-side ZooKeeper setting.
// main.ts
import { NestFactory } from '@nestjs/core';
import { MicroserviceOptions, Transport } from '@nestjs/microservices';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.createMicroservice<MicroserviceOptions>(AppModule, {
    transport: Transport.KAFKA,
    options: {
      client: {
        brokers: ['kafka1:9092'],
      },
      consumer: {
        groupId: 'order-service',
      },
    },
  });
  await app.listen();
}
bootstrap();
2 ) ISR tuning parameters:
unclean.leader.election.enable=false   # never let an out-of-sync replica become leader
min.insync.replicas=2                  # minimum number of in-sync replicas
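With acks=all, min.insync.replicas acts as a simple admission check on the leader. A minimal sketch of the rule (not actual broker code):

```typescript
// With acks=all, the leader rejects a write (NOT_ENOUGH_REPLICAS) whenever
// the current ISR has shrunk below min.insync.replicas.
function canAcceptAcksAllWrite(isrSize: number, minInsyncReplicas: number): boolean {
  return isrSize >= minInsyncReplicas;
}

console.log(canAcceptAcksAllWrite(3, 2)); // healthy ISR of 3: true
console.log(canAcceptAcksAllWrite(1, 2)); // only the leader left: false
```

This is why the pair of settings trades availability for durability: once too many replicas fall behind, writes fail instead of being accepted with weaker guarantees.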
Three Cluster Deployment Options Compared
| Option | Use case | Strengths | Drawbacks |
|---|---|---|---|
| Single-machine pseudo-cluster | Development/testing | Low resource usage, quick to validate logic | No fault tolerance |
| Multi-host cluster | Small/medium production | Real fault tolerance, linear scaling | Moderate operational complexity |
| Kubernetes Operator | Cloud-native environments | Auto-scaling, declarative configuration | Requires K8s infrastructure |
Recommendation: prefer a multi-host cluster in production; in cloud-native setups, manage Kafka with the Strimzi Operator.
Kafka Cluster Administration Commands
| Task | Example command |
|---|---|
| Inspect cluster/broker info | kafka-broker-api-versions.sh --bootstrap-server localhost:9092 |
| Partition reassignment | kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file <plan.json> --execute |
| Monitor under-replicated partitions | kafka-topics.sh --describe --under-replicated-partitions --bootstrap-server localhost:9092 |
| Preferred leader election | kafka-leader-election.sh --bootstrap-server localhost:9092 --election-type PREFERRED --all-topic-partitions |
Key Operational Configuration
server.properties core parameters
default.replication.factor=3           # default replica count for new topics
min.insync.replicas=2                  # minimum in-sync replicas
unclean.leader.election.enable=false   # forbid non-ISR replicas from being elected leader
auto.leader.rebalance.enable=true      # rebalance leaders automatically
Durability rule of thumb: with producer acks=all and min.insync.replicas=2, every acknowledged write exists on at least two replicas, so a single broker failure loses no acknowledged data.
Cluster Optimization Directions
- Capacity planning
  - Partition count = target throughput / per-partition throughput
  - Replication factor = failures tolerated + 1 (3 or more recommended)
- Key monitoring metrics
  - Under-replicated partitions (replica lag)
  - Active controller count (split-brain detection)
  - Request handler idle ratio (thread saturation)
- Disaster recovery strategy
  - Back up the __consumer_offsets topic regularly
  - Configure rack awareness across data centers (broker.rack)
  - Enable log compaction where appropriate (cleanup.policy=compact)
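The capacity-planning formulas above can be sketched directly (units are illustrative):

```typescript
// Partition count from throughput targets (round up to whole partitions).
function requiredPartitions(targetMBps: number, perPartitionMBps: number): number {
  return Math.ceil(targetMBps / perPartitionMBps);
}

// Replication factor from the number of broker failures to tolerate.
function replicationFactor(failuresTolerated: number): number {
  return failuresTolerated + 1; // the article recommends >= 3
}

console.log(requiredPartitions(300, 50)); // → 6 partitions
console.log(replicationFactor(2)); // → 3
```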
Summary
This article covered Kafka cluster deployment and the replica mechanism:
- Cluster architecture essentials: broker.id identity + ZooKeeper coordination
- Core value of replicas: high availability through redundancy (RF=3 recommended)
- NestJS integration approaches:
  - Basic produce/consume
  - Transactional messaging
  - Custom partition assignment
- Production-grade configuration: ISR tuning + multi-node deployment