Apache Kafka ships with a bundled copy of Apache ZooKeeper, which is used to manage the Kafka cluster's metadata and state. ZooKeeper is a distributed coordination service widely used in distributed systems for configuration management, naming services, distributed locks, and similar scenarios.
In Kafka, ZooKeeper plays several important roles:
Configuration storage: Kafka uses ZooKeeper to store cluster metadata, including broker registrations and the assignment of topics and partitions. This information is kept in ZooKeeper data nodes (ZNodes).
Election coordination: every Kafka partition has one broker serving as its leader, plus zero or more followers. ZooKeeper is used to elect the cluster controller, which coordinates leader election among brokers so that each partition always has an available leader.
Offset management: the legacy consumer could store and retrieve its per-partition consumption offsets in ZooKeeper, so that a consumer group could resume from where it left off after rejoining. Note that since Kafka 0.9, clients commit offsets to the internal __consumer_offsets topic rather than to ZooKeeper.
Keep in mind that the bundled ZooKeeper is meant to manage Kafka's internal metadata and state, not to serve as a general-purpose ZooKeeper cluster. If other applications need ZooKeeper, it is usually advisable to deploy a separate, dedicated ZooKeeper cluster.
Starting with version 2.8.0, Kafka introduced a new cluster coordination mechanism called KRaft (Kafka Raft Metadata mode), which gradually replaces ZooKeeper-based metadata management. In newer releases the dependency on ZooKeeper keeps shrinking, and future versions are expected to remove it entirely.
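As a rough illustration of what a ZooKeeper-free broker looks like, the Kafka 2.8 distribution ships a sample KRaft configuration under config/kraft/. A minimal sketch (the values are illustrative and not part of the deployment described below):
# config/kraft/server.properties (illustrative sketch only)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
Before the first start, the storage directory has to be formatted once with the bundled kafka-storage.sh tool.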
Environment:
Server: one CentOS 7.9 machine
Kafka: kafka_2.13-2.7.1, with bundled ZooKeeper version 3.5.9
Kafka: kafka_2.13-2.8.1, with bundled ZooKeeper version 3.5.9
Kafka configuration file, node 1 as an example. Since all three brokers run on the same host, each node must use its own broker.id, listener port, and log.dirs:
# config/server.properties
broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://<current-node-IP>:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/opt/kafka_cluster/node1/log
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2182,xxx.xxx.xxx.xxx:2183
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
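Nodes 2 and 3 reuse the same file with only a few keys changed. A sketch for node 2 (the 9292 port is an assumption based on the 9192/9292/9392 scheme used by the Spring client configuration later in this article; adjust it to your actual layout):
# config/server.properties (node 2, differing keys only)
broker.id=2
listeners=PLAINTEXT://0.0.0.0:9292
advertised.listeners=PLAINTEXT://<current-node-IP>:9292
log.dirs=/opt/kafka_cluster/node2/log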
ZooKeeper configuration file, node 2 as an example. In each server.N entry, the first port is the quorum/data-sync port, the second is the leader-election port, and the port after the semicolon is that node's client port (ZooKeeper 3.5+ syntax):
# zookeeper.properties
dataDir=/opt/zk_cluster/node2/data
dataLogDir=/opt/zk_cluster/node2/dataLog
clientPort=2182
maxClientCnxns=0
admin.enableServer=false
initLimit=5
syncLimit=2
server.1=<server-IP>:2881:3881;2181
server.2=<server-IP>:2882:3882;2182
server.3=<server-IP>:2883:3883;2183
Manually create a myid file in each node's data directory (its content is that node's unique ZK ID), node 3 as an example:
3
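For instance, assuming node 3's data directory follows the same pattern as node 2's above:
echo 3 > /opt/zk_cluster/node3/data/myid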
ZooKeeper startup scripts
zk_node1_startup.sh
#!/bin/bash
exec nohup /opt/kafka_node1/bin/zookeeper-server-start.sh /opt/kafka_node1/config/zookeeper.properties > ./zk_node1.out 2>&1 &
tail -f ./zk_node1.out
zk_node2_startup.sh
#!/bin/bash
exec nohup /opt/kafka_node2/bin/zookeeper-server-start.sh /opt/kafka_node2/config/zookeeper.properties > ./zk_node2.out 2>&1 &
tail -f ./zk_node2.out
zk_node3_startup.sh
#!/bin/bash
exec nohup /opt/kafka_node3/bin/zookeeper-server-start.sh /opt/kafka_node3/config/zookeeper.properties > ./zk_node3.out 2>&1 &
tail -f ./zk_node3.out
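After all three nodes are up, each node's role can be checked with the srvr four-letter command (srvr is on ZooKeeper 3.5's default 4lw whitelist; this assumes nc is available on the host):
echo srvr | nc 127.0.0.1 2181
echo srvr | nc 127.0.0.1 2182
echo srvr | nc 127.0.0.1 2183
One node should report Mode: leader and the other two Mode: follower.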
Kafka startup scripts
kafka_node1_startup.sh
#!/bin/bash
exec nohup /opt/kafka_node1/bin/kafka-server-start.sh /opt/kafka_node1/config/server.properties > ./kafka_node1.out 2>&1 &
tail -f ./kafka_node1.out
kafka_node2_startup.sh
#!/bin/bash
exec nohup /opt/kafka_node2/bin/kafka-server-start.sh /opt/kafka_node2/config/server.properties > ./kafka_node2.out 2>&1 &
tail -f ./kafka_node2.out
kafka_node3_startup.sh
#!/bin/bash
exec nohup /opt/kafka_node3/bin/kafka-server-start.sh /opt/kafka_node3/config/server.properties > ./kafka_node3.out 2>&1 &
tail -f ./kafka_node3.out
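Once the brokers are running, a quick sanity check is to create and list a topic (test-topic is just an example name; adjust the port to the broker's actual listeners value, 9092 in the node 1 config above):
/opt/kafka_node1/bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --create --topic test-topic --partitions 3 --replication-factor 3
/opt/kafka_node1/bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --list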
Spring Boot integration with Kafka
Configuration file:
spring:
  kafka:
    bootstrap-servers: 10.110.86.40:9192,10.110.86.40:9292,10.110.86.40:9392
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      properties:
        max.block.ms: 8000
    consumer:
      group-id: ws-default-groupId
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      max-poll-records: 20
    # consumer-2 is not a standard Spring Kafka property; it is a custom section
    # bound via @Value in the configuration class below
    consumer-2:
      properties:
        max.poll.records: 1
        max.poll.interval.ms: 600000
        session.timeout.ms: 900000
        heartbeat.interval.ms: 20000
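With this configuration Spring Boot auto-configures a KafkaTemplate<String, String>. A minimal producer sketch (the DemoProducer class and the demo-topic name are illustrative, not part of the original setup):

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class DemoProducer {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public DemoProducer(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void send(String key, String value) {
        // Asynchronous send; blocks at most max.block.ms (8000 ms above)
        // while waiting for metadata or buffer space
        kafkaTemplate.send("demo-topic", key, value);
    }
}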
Configuration class (reads the custom consumer-2 section above via @Value):
import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ConsumerAwareListenerErrorHandler;
import org.springframework.kafka.listener.ContainerProperties;

import java.util.HashMap;
import java.util.Map;

@Slf4j
@Configuration
public class KafkaConfigExt {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.consumer-2.properties.max.poll.records}")
    private int consumer2MaxPollRecords;

    @Value("${spring.kafka.consumer-2.properties.max.poll.interval.ms}")
    private int consumer2MaxPollInterval;

    @Value("${spring.kafka.consumer-2.properties.session.timeout.ms}")
    private int consumer2SessionTimeout;

    @Value("${spring.kafka.consumer-2.properties.heartbeat.interval.ms}")
    private int consumer2HeartbeatInterval;

    // Error handler referenced by name from @KafkaListener(errorHandler = "consumerAwareErrorHandler");
    // logs the failure with its full stack trace and swallows the exception
    @Bean
    public ConsumerAwareListenerErrorHandler consumerAwareErrorHandler() {
        return (message, exception, consumer) -> {
            log.error("Kafka consumption failed: exception [{}], message [{}], consumer [{}]",
                    exception.getMessage(), message.getPayload(), consumer, exception);
            return null;
        };
    }

    // Consumer factory for the second consumer group, driven by the custom consumer-2 properties
    @Bean
    public ConsumerFactory<String, String> group2ConsumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, consumer2MaxPollRecords);
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, consumer2MaxPollInterval);
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, consumer2SessionTimeout);
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, consumer2HeartbeatInterval);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    // Listener container factory with manual acknowledgment,
    // referenced via containerFactory = "group2KafkaFactory"
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> group2KafkaFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(group2ConsumerFactory());
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
        return factory;
    }
}
Listener code:
@KafkaListener(topics = {"${config.item.topic}"}, groupId = "${config.item.group-id}", errorHandler = "consumerAwareErrorHandler")
public void exec1(ConsumerRecord<String, Object> record) {
    // Default container factory: offsets are committed by the container automatically
    String val = (String) record.value();
    // ... business logic ...
}

@Value("${config.item.nack-sleep:300000}")
private Long nackSleepTime;

@KafkaListener(topics = {"${config.item.topic}"}, groupId = "${config.item.group-id}", concurrency = "${config.item.concurrency}",
        containerFactory = "group2KafkaFactory", errorHandler = "consumerAwareErrorHandler")
public void exec2(ConsumerRecord<String, Object> record, Acknowledgment ack) {
    try {
        String val = (String) record.value();
        // ... business logic ...
        ack.acknowledge(); // commit the offset only after successful processing
    } catch (Throwable e) {
        // Negative-acknowledge: pause the consumer for nackSleepTime ms, then redeliver from this record
        ack.nack(nackSleepTime);
    }
}
Manual acknowledgment can also be enabled in the configuration file instead of through a custom container factory:
spring:
  kafka:
    bootstrap-servers:
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      properties:
        max.block.ms: 8000
    consumer:
      group-id:
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      enable-auto-commit: false
      auto-offset-reset: earliest
    listener:
      # ack-mode and concurrency belong under spring.kafka.listener, not under consumer
      ack-mode: manual_immediate
      concurrency: 1
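With enable-auto-commit: false and ack-mode: manual_immediate, the auto-configured container factory injects an Acknowledgment into the listener, and the offset is committed as soon as acknowledge() is called. A minimal sketch (handle() stands in for the business logic):

@KafkaListener(topics = {"${config.item.topic}"})
public void onMessage(ConsumerRecord<String, String> record, Acknowledgment ack) {
    handle(record.value()); // hypothetical business method
    ack.acknowledge();      // committed immediately under manual_immediate
}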